THE EMERGENCE OF THE DETERMINISTIC
HODGKIN HUXLEY EQUATIONS AS A LIMIT FROM THE 
UNDERLYING STOCHASTIC ION-CHANNEL MECHANISM 

By Tim D. Austin 
University of California at Los Angeles 

In this paper we consider the classical differential equations of 
Hodgkin and Huxley and a natural refinement of them to include a 
layer of stochastic behavior, modeled by a large number of finite- 
state-space Markov processes coupled to a simple modification of 
the original Hodgkin-Huxley PDE. We first prove existence, unique- 
ness and some regularity for the stochastic process, and then show 
that in a suitable limit as the number of stochastic components of 
the stochastic model increases and their individual contributions de- 
crease, the process that they determine converges to the trajectory 
predicted by the deterministic PDE, uniformly up to finite time hori- 
zons in probability. In a sense, this verifies the consistency of the 
deterministic and stochastic processes. 

1. Introduction: Ion channels of excitable membranes. Most neurons in 
most organisms have an axon: a long, narrow conduit connecting the central, 
roughly spherical part of the cell (the soma) to a network of smaller branches 
and ultimately to the synapses, which form connections with other neurons 
(principally at branched projections from the latter called dendrites). The 
axon connects the soma to synapses that may be a great distance away 
(often several cm) relative to the size of the soma or the diameter of the 
axon (typically a few μm). The function of the neuron relies partly on its
ability to transmit signals from the soma to other neurons over this long 
distance via the axon. 

The nature of these signals had begun to become clear during the 1930s, 
but only with Hodgkin and Huxley's (Nobel Prize-winning) work on the 
mechanism of signal transmission in the squid giant axon in the early 1950s 
were the first foundations laid of an accurate mathematical model of their
behavior (see [12]). 

Since then Hodgkin and Huxley's original analysis has been extended 
and refined repeatedly. The mathematics underlying the resulting models 
has been studied for the sake of both more accurate numerical modeling 
and better theoretical understanding. In particular, Hodgkin and Huxley's 
empirical, deterministic model has been refined to a model of the axon in 
which the relevant behavior arises from the combined contributions of a 
large number of small stochastic components. 

Since the present paper is primarily about mathematics, we will assume fa- 
miliarity with the basic physiological origins of the deterministic and stochas- 
tic models (in particular, the working of voltage-dependent ion channels and 
the action potential). In these terms we will give a brief motivation for the 
two models in Section 2.1. However, once we have reached the definitions of 
the equations themselves further references to the physiology will be periph- 
eral, and not important for understanding the paper. A thorough treatment 
of this physiology can be found in Hille's classic text [11], while a more 
mathematical description of various such models can be found in Cronin [2] . 

It is worth noting that, while the deterministic mathematical model has 
been studied intensively (particularly for numerical computation purposes), 
the results for the stochastic model are fairly few. As far as I am aware, the 
pure mathematics behind the stochastic model considered below has never 
been worked out in detail. In this paper we will prove an existence theorem 
for that stochastic process, and, more interestingly, the convergence of its 
various components (such as the function giving the membrane potential 
along the axon at a particular instant in time) to their counterpart trajec- 
tories in the deterministic theory, uniformly up to finite time horizons in 
probability. 

(Results analogous to this have been obtained by Fox and Lu [9] (build- 
ing on a simulation method of DeFelice and Isaac [3] ) for the case in which 
the membrane potential is assumed constant along the entire length of the 
axon at each instant. In this case the partial differential equations we will 
encounter simplify to ordinary differential equations, coupled with a finite 
number of discrete stochastic processes that can then be studied using the 
standard methods of Fokker-Planck and Langevin equations. In fact, this 
simpler case corresponds more closely to the original experimental set up 
of Hodgkin and Huxley, in which a fine conducting silver wire was inserted 
along the axon, causing the membrane potential to adjust to a single com- 
mon value along the axon effectively instantaneously.) 

The consequences of the general stochastic model have received increased 
interest in recent efforts, first by Chow and White [1] and then Faisal, White 
and Laughlin [8], to estimate how much noise the actual stochastic nature 
can introduce to a real neuron (behavior that would not appear in the
deterministic approximation) and what constraints this places on the size of
the axon if it is to function reliably. We will remark more on this briefly in 
Section 5. 

Remark. When a suitable stimulus is applied (e.g., from the soma at 
one end of the axon), exceeding a certain threshold, the trajectory of the 
potential difference along the axon evolves through a family of subthreshold 
configurations into an action potential. After moving away from its point of 
origin, this trajectory asymptotically takes the special form of a traveling 
wave. Although the possibility of such a traveling wave is key to the axon's 
ability to transmit a signal, we will not refer to it again in this paper. Our 
subsequent convergence results require only the existence of some sufficiently 
regular time evolution of the system given appropriate initial conditions. 

2. The mathematical models. In this section we describe the precise 
mathematical models that we will study. We will assume standard notions 
from stochastic analysis and PDE. 

2.1. Basic components. This subsection assumes some knowledge of the
physiology of axons; the uninterested reader may skip to the definitions of
the equations in the next subsection without impediment.

In microscopic detail, the instantaneous electrical state of the axon de- 
pends on the locations of all the ions in solution inside and outside the axon, 
on the locations and internal states of any molecular mechanisms at work in 
the axon (the ion channels, in particular), and on various other components 
of the system. As usual, we do not actually work at this level of detail, but 
instead make a number of simplifications. However, there is some choice in 
this procedure. We will see that heuristically the two different models to 
be studied arise from two different such approximations, one coarser than 
the other: in particular, the stochastic model describes the working of indi- 
vidual ion channels, whereas the deterministic model "averages out" their 
behavior, involving instead functions that describe the proportion of those 
channels in a small neighborhood of a point that are in each possible state. 

Before explaining this difference in more detail, we will describe some 
simple approximations that are made in both cases. First, although the axon 
is described above as a tube (and actually has a membrane with considerable 
molecular structure of its own and further cellular components within the 
axonal fluid), its diameter is so small compared with its length (typically
less than 10 μm compared with a few cm) that all the relevant quantities
vary only negligibly across it, and so we simplify the geometric description
of the axon to an interval $I = [-\ell, \ell]$. We write $I^\circ$ for the interior
$(-\ell, \ell)$ of this interval.

It is worth noting that voltage-dependent ion channels embedded in a
cellular membrane seem to be involved in physiological processes other than 
the working of axons, sometimes in a pattern spread over a nontrivial area, 
rather than the approximately-linear distribution we are assuming here. It is 
possible that the analysis that follows below could be adapted to this case, 
but I have not tried this. For details of such physiology see Hille [11]. 

Second, in both models we also approximate the distribution of the indi- 
vidual ions in space by continuous concentrations, to be described by suitable 
PDE. However, it is possible to go further and avoid altogether the need to 
work with separate data for each different kind of ion. To do this we deal 
instead only with the variation of membrane potential along the axon. That 
this contains all the information we need follows from the further assumption 
that the flows of the different ions across the axonal membrane, although 
enough to give rise to the relevant changes in the membrane potential, are 
negligible compared with the concentration levels that remain both inside 
and outside the axon. This assumption guarantees that the concentration 
gradients change only negligibly during the working of the axon, and there- 
fore that the effect of ion influx and efflux on membrane potential is correctly 
described by an equivalent driving potential for each kind of ion. We will 
omit the relevant mathematical working to establish this description here; 
see page 37 of Hille [11]. 

Thus the state of our system is partly described by a function $v : I \to \mathbb{R}$
giving the value of the membrane potential at each point along the axon.
Since ions can diffuse along the axon, the variation of this function with time
will also exhibit diffusive behavior, allowing us to impose certain regularity
conditions on it. For simplicity we will assume that the diffusivity constant
is 1 throughout this paper; a simple scaling of $I$ recovers the general case.

It will turn out to be suitable to assume that $v$ is Lipschitz and lies in the
Sobolev space $H^1(I)$. In fact, we will assume a little more for technical
simplicity. In a real axon the equilibrium potential need not be zero; however,
we can and will shift our origin so that we can treat it as zero. We do this
because we will find it helpful to impose the condition that our functions
vanish at $\pm\ell$ and so to restrict attention to potential difference functions in
$H_0^1(I)$. If we do not make this change of origin, then our results for such
functions will certainly still be valid; the problem, rather, is that they will
no longer apply to the biophysically interesting situation.

We will write $v_t$ for the potential difference function at time $t$ in the
deterministic model, and $V_t$ for that in the stochastic model. In both cases these
should evolve following a continuous trajectory in $H_0^1(I)$ (in the stochastic
case, this means as a process in this space with continuous sample paths).

Next we must decide how to model the ion channels and their effect on the 
membrane potential; it is here that our two models of the axon will diverge. 
We do restrict ourselves in both models to the case in which all ion
channels are identical, and can be in any of a finite set $E$ of possible channel
states. On the other hand, it turns out that there is no great increase in
difficulty if we allow several of the possible states $\xi \in E$ to allow the passage
of ions with different conductivities. In reality there are different kinds of
channel for different kinds of ion, but one finds at once that the resulting
mathematical descriptions differ only in notation. Also, in practice there is
a constant leak conductivity (corresponding to ion flow across the axonal
membrane other than through channels); for simplicity we ignore this also,
as we may assume that it has been absorbed by a suitable modification of
the conductivities of our channels.

As already mentioned, to each kind of ion there corresponds an equivalent
potential difference which "drives" the passage of those ions either into or
out of the axon through their corresponding ion channels. Given our
treatment of all channels as identical, as described above, in our model these
driving potentials $v_\xi$ will actually correspond to the different possible
channel states $\xi \in E$ (it will follow at once from the form of the equations that
in this arrangement suitable values of the $v_\xi$ can be taken as sums of those
corresponding to different types of ion, and that in the case of a state $\xi$ that
allows no ions to flow, the value we give to $v_\xi$ will be of no consequence). In
our stochastic model a channel at position $x$ will jump between states $\xi, \zeta$
at random at rates $\alpha_{\xi,\zeta}(V)$, where $V$ is the value of the potential difference
at the relevant point $x$. We will write $\alpha_\xi = \sum_{\zeta \neq \xi} \alpha_{\xi,\zeta}$ for the total rate of
leaving state $\xi$.

We will assume that the functions $\alpha_{\xi,\zeta}$ are all smooth and take values
between two fixed constants in $(0,\infty)$ (this certainly holds in the actual
models that are used). I do not know to what extent this condition could be
weakened in what follows; certainly some regularity is needed, and we will
use the finiteness of $\mathrm{Lip}(\alpha_{\xi,\zeta})$ explicitly.

Write $v_- = \min_{\xi \in E} v_\xi$, $v_+ = \max_{\xi \in E} v_\xi$, and assume that $v_- < 0 < v_+$ (this
is also true in real axons).

Now we can describe our stochastic model; in fact there will be one such
model for each $N \in \mathbb{N}$. In the $N$th member of this sequence, the axon is
populated by $\lceil 2N\ell \rceil - 1$ channels at positions $\frac{1}{N}(\mathbb{Z} \cap NI^\circ)$, and each has the
normalized ion conductivities $\frac{1}{N} c_\xi$ corresponding to the states $\xi \in E$ (the
values $c_\xi \geq 0$ being fixed independent of $N$).

We will generally write $\Xi_t$ for the configuration of all the channels in such
a stochastic model; this is in the state space $E^{\mathbb{Z} \cap NI^\circ}$. If we want to make $N$
explicit, we include it as a superscript, as in $\Xi_t^{(N)}$ and $V_t^{(N)}$.

Our deterministic model arises heuristically as the limit of the stochastic
model with very many very small ion channels; that is, for large $N$. In the
deterministic model we introduce a new family of functions, $p_\xi \in \mathrm{Lip}(I, [0,1])$
for $\xi \in E$, that replicates the role of the individual-channel configurations
$\Xi \in E^{\mathbb{Z} \cap NI^\circ}$. The value $p_\xi(x)$ is to be interpreted as "the proportion of those
channels in a small neighborhood of the point $x$ that are in state $\xi$"; we will
see that at all times $\sum_{\xi \in E} p_\xi = 1$.

We will write $p_{\xi,t}$ for the proportion functions at time $t$; these should all
evolve following continuous paths in $\mathrm{Lip}(I, [0,1])$.

Remarks on notation. Henceforth we will use the Sobolev spaces
$H_0^1$ and $H^{-1}$ without further comment. Many good treatments of Sobolev
and other function spaces are available in standard texts on PDE; see, for
example, Chapter 5 and Section 7.1 of Evans [7].

Given a nonnegative integrable function $f \in L^1(I)$ and writing $\mu$ for
Lebesgue measure on $I$, we denote by $\mu \llcorner f$ the indefinite-integral measure:

\[ (\mu \llcorner f)(A) = \int_A f \, d\mu. \]

We will also sometimes regard such measures as bounded linear functionals
on one or other function space. Given a function $g$ in such a space and a
functional $\mu$, we write $\langle g, \mu \rangle$ for the evaluation in the obvious way.

We will write $D$ for differentiation of differentiable functions on $I$ and $\Delta$
for the one-dimensional Laplacian $D^2$, and will use the notation $\chi_E$ for the
indicator function of a set $E$.

2.2. The deterministic equations. Henceforth suppose that we have $v_0 \in
H_0^1$ with $v_- \leq v_0 \leq v_+$ and a family $(p_{\xi,0})_{\xi \in E}$ of Lipschitz functions $I \to [0,1]$
such that $\sum_{\xi \in E} p_{\xi,0} = 1$ everywhere; these are our initial conditions for the
deterministic model described in the previous subsection. We also fix now
and for the rest of the paper a finite but arbitrary time horizon $T > 0$.

We are now ready to make the following definition. Note that we are using
implicitly a suitable notion of weak solution for our PDE (since we ask only
that the time derivative of the trajectory be in the space of functionals $H^{-1}$).

Definition 2.1. A continuous function $v : [0,T] \to H_0^1(I)$ and a family
$(p_\xi)_{\xi \in E}$ of continuous functions $p_\xi : [0,T] \to \mathrm{Lip}(I, [0,1])$ will be said to
satisfy the generalized deterministic Hodgkin-Huxley equations (D) with initial
conditions $v_0$, $p_{\xi,0}$ if

• (Regularity) 

|p5GL^(,)[0,r] y^€E; 
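• (Dynamics: PDE)

\[ \frac{\partial}{\partial t} v_t = \Delta v_t + \sum_{\xi \in E} c_\xi \, p_{\xi,t} \cdot (v_\xi - v_t) \quad \forall t \in [0,T] \]

[we will refer to this equation as (D-PDE)];

• (Dynamics: proportions)

\[ \frac{d}{dt} p_{\xi,t} = \sum_{\zeta \in E \setminus \{\xi\}} \bigl( (\alpha_{\zeta,\xi} \circ v_t) \cdot p_{\zeta,t} - (\alpha_{\xi,\zeta} \circ v_t) \cdot p_{\xi,t} \bigr) \quad \forall \xi \in E,\ t \in [0,T] \]

[we will refer to this system of equations as (D-prop)];

• (Initial conditions: PDE)

\[ v_t|_{t=0} = v_0; \]

• (Initial conditions: proportions)

\[ p_{\xi,t}|_{t=0} = p_{\xi,0} \quad \forall \xi \in E; \]

• (Boundary conditions: PDE only)

\[ v_t(\pm\ell) = 0 \quad \forall t \in [0,T]. \]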

Remark. It follows at once by adding the relevant differential equations
that the sum $\sum_{\xi \in E} p_{\xi,t}$ is constant, and so is always equal to 1 everywhere;
this means we remain safe in our interpretation of $p_{\xi,t}$ as the proportion of
channels in a particular state.
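To make the structure of (D) concrete, the following is a minimal numerical sketch, not part of the analysis of this paper. It assumes a hypothetical two-state channel (state 0 "closed", state 1 "open"), invented conductivities, driving potentials and rate functions (bounded between fixed positive constants, as required above), and a plain explicit finite-difference scheme in space and time. The function name `simulate_deterministic` and all parameter values are illustrative assumptions.

```python
import numpy as np

def simulate_deterministic(n_grid=101, ell=1.0, T=0.05):
    """Explicit Euler / finite-difference sketch of (D) for a hypothetical
    two-state channel. Returns the grid, v_T and the proportions p_{xi,T}."""
    x = np.linspace(-ell, ell, n_grid)
    dx = x[1] - x[0]
    dt = 0.25 * dx**2                      # small enough for the explicit heat step

    # Hypothetical model data (assumptions, not taken from the paper).
    c = np.array([0.0, 1.0])               # conductivities c_xi
    v_drive = np.array([-1.0, 1.0])        # driving potentials, v_- < 0 < v_+
    alpha01 = lambda v: 1.0 + 0.5 * np.tanh(v)   # rate closed -> open
    alpha10 = lambda v: 1.0 - 0.5 * np.tanh(v)   # rate open -> closed

    v = 0.5 * np.cos(np.pi * x / (2 * ell))      # initial potential
    v[0] = v[-1] = 0.0                           # Dirichlet boundary values
    p = np.stack([np.full(n_grid, 0.7), np.full(n_grid, 0.3)])  # sums to 1

    t = 0.0
    while t < T:
        lap = np.zeros_like(v)
        lap[1:-1] = (v[2:] - 2.0 * v[1:-1] + v[:-2]) / dx**2
        # Reaction term: sum over states of c_xi * p_xi * (v_xi - v).
        reaction = sum(c[k] * p[k] * (v_drive[k] - v) for k in range(2))
        # Net probability flow closed -> open at the current potential.
        flux = alpha01(v) * p[0] - alpha10(v) * p[1]
        v = v + dt * (lap + reaction)
        v[0] = v[-1] = 0.0
        p = p + dt * np.stack([-flux, flux])
        t += dt
    return x, v, p
```

In this sketch the update for the proportions adds and subtracts the same flux, so the discrete analogue of the preceding Remark (the proportions sum to 1) holds up to rounding.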

2.3. The stochastic equations. We carry over the PDE initial condition
$v_0$ from the previous subsection, but now also assume given $\Xi_0 \in E^{\mathbb{Z} \cap NI^\circ}$,
the initial configuration of individual-channel states in the $N$th stochastic
model.

Definition 2.2. Suppose that $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0 \leq t \leq T}, \mathbb{P})$ is a filtered
probability space satisfying the usual conditions. Given a pair $(V_t, \Xi_t)_{0 \leq t \leq T}$ of
càdlàg adapted stochastic processes such that each sample path of $V$ is a
continuous map $[0,T] \to H_0^1(I)$ and $\Xi_t$ is in $E^{\mathbb{Z} \cap NI^\circ}$ for all $t \in [0,T]$, we
will say that they satisfy the $N$th stochastic Hodgkin-Huxley equations ($S_N$)
with initial conditions $v_0$, $\Xi_0$ if

• (Regularity) The map $t \mapsto \frac{\partial}{\partial t} V_t$ lies in $L^2_{H^{-1}(I)}[0,T]$ almost surely;

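• (Dynamics: PDE)

\[ \frac{\partial}{\partial t} V_t = \Delta V_t + \frac{1}{N} \sum_{i \in \mathbb{Z} \cap NI^\circ} c_{\Xi_t(i)} \bigl( v_{\Xi_t(i)} - V_t(i/N) \bigr) \delta_{i/N} \quad \forall t \in [0,T],\ \mathbb{P}\text{-a.s.} \]

[we will refer to this equation as ($S_N$-PDE)];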
• (Dynamics: jump)

\[ \mathbb{P}\bigl( \Xi_{t+h}(i) = \zeta \,\big|\, \Xi_t(i) = \xi \bigr) = \alpha_{\xi,\zeta}(V_t(i/N)) \, h + o_{h \downarrow 0}(h) \quad \forall t \in [0,T),\ h \in (0, T-t], \]

with the coordinate processes $(\Xi_{t+h}(i))_{h \geq 0}$ independent to first order in
$h$ as $h \downarrow 0$ conditional on $\mathcal{F}_t$ [we will refer to this system of equations and
conditions as ($S_N$-jump)];

• (Initial conditions: PDE)

\[ V_t|_{t=0} = v_0; \]

• (Initial conditions: jump)

\[ \Xi_t|_{t=0} = \Xi_0; \]

• (Boundary conditions: PDE only)

\[ V_t(\pm\ell) = 0 \quad \forall t \in [0,T]. \]
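The coupling between the potential and the channels in the stochastic model can be illustrated by a crude time-stepped simulation. This is only a sketch under stated assumptions: a hypothetical two-state channel with invented rates and driving potentials, a spatial grid of spacing $1/N$ that contains the channel sites $i/N$, and a small time step $h$ over which each channel flips independently with probability approximately (rate) times $h$, mimicking the first-order jump condition. On such a grid a point mass of size $1/N$ at a channel site corresponds to a source of height 1 at that grid point. The name `simulate_stochastic` and all parameters are illustrative assumptions.

```python
import numpy as np

def simulate_stochastic(N=50, ell=1.0, T=0.05, seed=0):
    """Time-stepped sketch of the N-th stochastic model for a hypothetical
    two-state channel. Returns the grid, V_T and the channel states at time T."""
    rng = np.random.default_rng(seed)
    M = int(round(N * ell))
    x = np.arange(-M, M + 1) / N           # grid; interior points are channel sites
    dx = 1.0 / N
    dt = 0.25 * dx**2

    # Hypothetical model data (assumptions, not taken from the paper).
    c = np.array([0.0, 1.0])               # conductivities c_xi
    v_drive = np.array([-1.0, 1.0])        # driving potentials v_xi

    V = 0.5 * np.cos(np.pi * x / (2 * ell))
    V[0] = V[-1] = 0.0
    state = (rng.random(x.size - 2) < 0.3).astype(int)   # one state per interior site

    t = 0.0
    while t < T:
        Vi = V[1:-1]
        lap = (V[2:] - 2.0 * Vi + V[:-2]) / dx**2
        # Mass 1/N at a site, divided by the cell width dx = 1/N, gives height 1.
        source = c[state] * (v_drive[state] - Vi)
        # Rate of leaving the current state at the local potential.
        rate = np.where(state == 0, 1.0 + 0.5 * np.tanh(Vi), 1.0 - 0.5 * np.tanh(Vi))
        jump = rng.random(state.size) < rate * dt
        V[1:-1] = Vi + dt * (lap + source)
        state = np.where(jump, 1 - state, state)
        t += dt
    return x, V, state
```

A scheme of this kind is convenient for experimentation but is not the construction used in the existence proof, which works directly with path-space measures.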



2.4. The goal of this paper. Before we can state the main result of this
paper we need a little more notation. For $\xi \in E$ we write $\zeta_{\xi,N}$ for the map
$E^{\mathbb{Z} \cap NI^\circ} \to H^{-1}(I)$ given by

\[ \zeta_{\xi,N}(\Xi) = \frac{1}{N} \sum_{i \in \mathbb{Z} \cap NI^\circ,\ \Xi(i) = \xi} \delta_{i/N} \]

[the Dirac deltas $\delta_{i/N}$ being readily interpreted as elements of $H^{-1}(I)$]; so
$\zeta_{\xi,N}(\Xi)$ places a mass of $\frac{1}{N}$ on each point $i/N \in I^\circ$ at which $\Xi$ is in
state $\xi$. We refer to it as the empirical distribution for $\Xi$. We introduce the
distributions $\zeta_{\xi,N}$ for each individual state $\xi$ to meet the notational needs
of the subsequent analysis.
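As an illustration of how the empirical distribution acts on a test function, the pairing of a function $g$ with the empirical distribution of state $\xi$ is simply $\frac{1}{N}$ times the sum of $g(i/N)$ over the sites currently in state $\xi$. The sketch below assumes a symmetric interval, so that the sites are $i = -(M-1), \ldots, M-1$ for some integer $M$, and the helper name `empirical_pairing` is hypothetical.

```python
import numpy as np

def empirical_pairing(g, states, xi, N):
    """Pair g with the empirical distribution of state xi:
    (1/N) * sum of g(i/N) over sites i whose state equals xi.
    `states` lists the channel states at sites i = -(M-1), ..., M-1."""
    states = np.asarray(states)
    M = (len(states) + 1) // 2
    sites = np.arange(-(M - 1), M) / N
    return float(np.sum(g(sites[states == xi])) / N)

# Example: all channels to the right of the origin are in state 1.
N = 200
sites = np.arange(-(N - 1), N) / N
states = (sites > 0).astype(int)
value = empirical_pairing(lambda x: x**2, states, 1, N)
# For large N this is close to the integral of x^2 over (0, 1), i.e. 1/3,
# which is the pairing of g with the measure of density p = indicator of (0, 1).
```

The example compares the empirical pairing with the corresponding integral against a proportion function, which is the kind of closeness that the second estimate of the main theorem quantifies in a Sobolev norm of negative order.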

We are now ready to state the result: 



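Theorem 2.3. Let $\varepsilon > 0$, and suppose given initial conditions $v_0$, $p_{\xi,0}$.
Then for any $N$ sufficiently large, say $N \geq N_1$, there exists an initial
condition $\Xi_0$ for ($S_N$) so that there is some "high-probability" $\Omega_1 \subseteq \Omega$ with
$\mathbb{P}(\Omega \setminus \Omega_1) < \varepsilon$ and such that

\[ \sup_{0 \leq t \leq T} \bigl\| V_t^{(N)} - v_t \bigr\|_{H^1(I)} < \varepsilon, \]

\[ \sup_{0 \leq t \leq T} \bigl\| \zeta_{\xi,N}(\Xi_t^{(N)}) - \mu \llcorner p_{\xi,t} \bigr\|_{H^{-1}(I)} < \varepsilon \quad \forall \xi \in E \]

on $\Omega_1$.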

Colloquially, this theorem tells us that as $N \to \infty$ the stochastic
ion-channel model of the axon gives a time-evolution of the potential difference
along the axon that converges to that given by the deterministic model,
uniformly up to a given finite time horizon, in probability.

This theorem will be proved in Section 4. The overarching idea when 
proving theorems of this sort is often to find an inequality that bounds the 
growth rate of the deviation in terms of the values the deviation has taken 
thus far; or, by integrating this inequality, to bound the current value of 
the deviation in terms of some average of the values it has taken so far. 
The most common formalization of this idea, and the one we will rely on, is 
Gronwall's lemma; we quickly recall this here: 

Proposition 2.4 (Gronwall's lemma). Suppose $T > 0$ and $f : [0,T] \to \mathbb{R}$
is continuous. Suppose further that there are constants $A, B \geq 0$ such that

\[ f(t) \leq A + B \int_0^t f(s) \, ds \]

for all $t \in [0,T]$. Then $f(t) \leq A e^{Bt}$ for all $t \in [0,T]$.
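A quick numerical sanity check of Gronwall's lemma may help fix ideas. The sketch below takes one sample function, $f(t) = A e^{Bt/2}$ (an arbitrary choice made here for illustration), verifies on a grid that it satisfies the integral hypothesis, and then confirms the exponential bound. The helper `check_gronwall` is hypothetical and uses a simple trapezoidal rule for the integral.

```python
import numpy as np

def check_gronwall(f_vals, t, A, B, tol=1e-9):
    """Check on a grid whether f <= A + B * int_0^t f (hypothesis) and
    whether f <= A * exp(B t) (conclusion). Returns both as booleans."""
    dt = np.diff(t)
    # Cumulative trapezoidal approximation of int_0^t f(s) ds.
    integral = np.concatenate([[0.0], np.cumsum(0.5 * (f_vals[1:] + f_vals[:-1]) * dt)])
    hypothesis = bool(np.all(f_vals <= A + B * integral + tol))
    conclusion = bool(np.all(f_vals <= A * np.exp(B * t) + tol))
    return hypothesis, conclusion

t = np.linspace(0.0, 2.0, 2001)
A, B = 1.5, 0.8
f_vals = A * np.exp(0.5 * B * t)   # sample function satisfying the hypothesis
result = check_gronwall(f_vals, t, A, B)
```

For this sample function the hypothesis holds because $A + B\int_0^t f = 2Ae^{Bt/2} - A \geq Ae^{Bt/2}$, and the conclusion is then the elementary inequality $e^{Bt/2} \leq e^{Bt}$.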

2.5. The three scales of the models. One interesting feature of the stochas- 
tic Hodgkin-Huxley model is that it relates behavior on three distinct scales: 
the flow of charge at the scale of individual ions; the opening and shutting 
of ion channels at the scale of large protein molecules; and the working of 
the whole axon. 

The stochastic model of the ion channel is faithful at the second and third 
of these, but uses a simplified description of the behavior at the first — the 
smallest — as a continuum charge distribution. As is standard, the "random" 
movement of a rarefied distribution of very many very small particles in a 
suitable medium (here ions in solution) is simplified to a continuum evolving 
in time according to a parabolic PDE. 

In this sense the stochastic model "averages away" the random behavior 
at the smallest scale. However, it retains a detailed description of the inter- 
mediate scale, whereas the deterministic equation takes an average here also: 
the simultaneous states of a great many small ion channels are forgotten, 
with only the proportion of channels in each state within each small length 
of the axon being retained. 

Thus we can think of the difference between the stochastic and deter- 
ministic models as one of resolution: although neither model can "see" the 
individual ions, the stochastic model can see single channels, whereas even 
these are beyond the deterministic model. In this sense the main result of 
this paper is a check that if we average out over the smallest scale to obtain 
the stochastic model, and then consider a suitable limit of this to represent 
the vanishing size of the intermediate scale, we recover the model obtained
by averaging over both smallest and intermediate scales from the start. 

Some slightly unusual features of the stochastic model can be traced back 
to this three-scale property of the system under study. More common ap- 
plications of Markov processes to the modeling of real-world systems need 
consider only two scales, often corresponding to the smallest and largest 
of the above. In this simpler case the state of the full Markov process will 
typically describe the complete state of the system in terms of a (large) dis- 
crete collection of components, possibly distributed in space; this may then 
have a continuum limit (often deterministic, but sometimes still stochastic, 
depending on the regime) in which the small scale has undergone averaging 
and so only a single, large-scale picture remains. The many stochastic com- 
ponents in the full model are traded in for a more complicated large-scale 
description, often based on spatially variable quantities evolving following a
PDE.


However, because we obtain our stochastic model by performing only some 
of the possible averaging, not all, we are left with both a large number of 
stochastic components (the ion channels), and a complicated, spatially vari- 
able deterministic system following a PDE (the membrane potential along 
the axon) coupled to them. We will find that this occasionally puts the 
analysis of the stochastic model slightly beyond the reach of more routine 
results in either stochastic processes or PDE, and so just a little thought is 
needed to combine both disciplines and obtain useful results. We will see this 
first when proving existence for the stochastic processes, and again when we 
come to the estimates for the growth of various related stochastic processes 
that we need for proving convergence. 

3. Preliminary results. In the first two subsections below we discuss vari- 
ous general facts about a relevant diffusion semigroup and about the sample 
paths of finite-state-space Markov processes. We then move on to discuss 
existence and regularity for our equations (D) and (Sat). 

3.1. A diffusion semigroup. To prove our main existence and regularity
results later in this section we will need some facts about the Feller semigroup
$(P_t)_{t \geq 0}$ corresponding to Brownian motion in $I$ absorbed at the end-points
of $I$.

This semigroup can help us because of the connection between diffusive
PDE and Feller diffusion processes that allows us to rewrite (D-PDE) in the
integral form

\[ v_t = P_t v_0 + \int_0^t P_{t-s} \Bigl( \sum_{\xi \in E} c_\xi \, p_{\xi,s} \cdot (v_\xi - v_s) \Bigr) ds \]

and similarly for ($S_N$-PDE). In order to use this integral representation we
first need to prove certain regularity properties of the semigroup.

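Lemma 3.1. Let $y$ be in the interior of $I$. Then $P_t \delta_y$ is a smooth function
on $I$ vanishing at the end-points for any $t > 0$. Furthermore:

1. there is some constant $C_1 > 0$, depending on $T$ but otherwise not on
$t \in [0,T]$, such that for any continuous function $f : [0,t] \to \mathbb{R}$ we have
that the function

\[ I \to \mathbb{R} : x \mapsto \int_0^t f(s) P_{t-s} \delta_y(x) \, ds \]

is in $H_0^1(I)$ and satisfies the estimate

\[ \Bigl\| \int_0^t f(s) P_{t-s} \delta_y(\cdot) \, ds \Bigr\|_{H^1(I)} \leq C_1 \|f\|_\infty; \]

2. for any fixed $\varepsilon > 0$ there is some constant $C_2(\varepsilon)$, depending on $\varepsilon$ and
$T$ but otherwise not on $t \in [0,T]$, such that for any continuous function
$f : [0,t] \to \mathbb{R}$ we have

\[ \Bigl\| \int_0^{t-\varepsilon} f(s) P_{t-s} \delta_y(\cdot) \, ds \Bigr\|_{H^1(I)} \leq C_2(\varepsilon) \int_0^{t-\varepsilon} |f(s)| \, ds \]

for any $t \in [0,T]$.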



Remark. Note that the imposition of a fixed $\varepsilon > 0$ in the second
estimate is necessary; without it the result can be made to fail for any given
choice of $C_2$ by choosing $f$ to be zero apart from in $(t - \eta, t)$, where it rapidly
becomes very large, for some sufficiently small $\eta > 0$.

Proof of Lemma 3.1. By additivity we may assume $f \geq 0$.
One-dimensional Brownian motion has the transition density

\[ p_t(x,y) = \frac{1}{\sqrt{2\pi t}} e^{-|x-y|^2/2t}, \]

so that for $f \in C_b(\mathbb{R})$ the corresponding Feller semigroup $(Q_t)_{t \geq 0}$ is given
by

\[ Q_t f(x) = \int_{\mathbb{R}} f(y) \frac{1}{\sqrt{2\pi t}} e^{-|x-y|^2/2t} \, \mu(dy). \]

Now our semigroup $(P_t)_{t \geq 0}$ corresponds to Brownian motion absorbed at
the end-points of $I$ (see, e.g., Chapter 24 of [14]). This semigroup has the
modified transition density

\[ p_t^I(x,y) = p_t(x,y) - \mathbb{E}_x\bigl( p_{t-\tau}(W_\tau, y) \chi_{\{\tau < t\}} \bigr) \]
\[ = p_t(x,y) - \mathbb{E}_x\bigl( p_{t-\tau}(\ell, y) \chi_{\{\tau < t,\ W_\tau = \ell\}} \bigr) - \mathbb{E}_x\bigl( p_{t-\tau}(-\ell, y) \chi_{\{\tau < t,\ W_\tau = -\ell\}} \bigr), \]

where $W$ is our Brownian motion and $\tau$ is the hitting time of the boundary of
$I$. Applying $P_t$ to the Dirac point-mass $\delta_y$ we recover precisely this expression
for $p_t^I(x,y)$. That $P_t \delta_y$ is a smooth function vanishing at the end-points of
$I$ now follows at once.

It remains to establish the two estimates. From the above we have

\[ \int_0^t f(s) P_{t-s} \delta_y(x) \, ds = \int_0^t f(s) \frac{1}{\sqrt{2\pi(t-s)}} e^{-|x-y|^2/2(t-s)} \, ds \]
\[ \qquad - \int_0^t f(s) \, \mathbb{E}_x\bigl( p_{t-s-\tau}(\ell, y) \chi_{\{\tau < t-s,\ W_\tau = \ell\}} \bigr) ds \]
\[ \qquad - \int_0^t f(s) \, \mathbb{E}_x\bigl( p_{t-s-\tau}(-\ell, y) \chi_{\{\tau < t-s,\ W_\tau = -\ell\}} \bigr) ds. \]

It suffices to prove the desired regularity for each of these three integrals
separately. That both of the estimates hold with suitable constants (the
second depending on $\varepsilon$) for the second and third integrals is clear, since $y$
is fixed away from $\pm\ell$ and so the expressions inside the expectations $\mathbb{E}_x$ are
uniformly bounded functions of $x$ with uniformly bounded equicontinuous
derivatives as $s$ varies in $(0,t)$.

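We are left with the first integral, which we break into two pieces.

First we estimate the integral over $(t - \varepsilon, t)$. We have

\[ \int_{t-\varepsilon}^t f(s) \frac{1}{\sqrt{2\pi(t-s)}} e^{-|x-y|^2/2(t-s)} \, ds \leq \|f\|_\infty \int_{t-\varepsilon}^t \frac{1}{\sqrt{2\pi(t-s)}} e^{-|x-y|^2/2(t-s)} \, ds, \]

and now, making the substitution $s = t - 1/u^2$, this is

\[ \|f\|_\infty \int_{1/\sqrt{\varepsilon}}^\infty \frac{2}{\sqrt{2\pi} \, u^2} e^{-|x-y|^2 u^2/2} \, du. \]

For any given $\varepsilon$ this is clearly a smooth function of $x$, since the integrand
over $(1/\sqrt{\varepsilon}, \infty)$ is dominated by $1/u^2$. It is also clear that it converges to
0 in $\| \cdot \|_{L^2(I)}$ as $\varepsilon \to 0$, by dominated convergence; by nonnegativity the same
is true of the original integral involving $f$.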
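
We may also differentiate our integral over $(t - \varepsilon, t)$ with respect to $x$ under
the integral sign (except, possibly, at $x = y$) to obtain the new integral

\[ -\int_{t-\varepsilon}^t f(s) \frac{x-y}{\sqrt{2\pi} \, (t-s)^{3/2}} e^{-|x-y|^2/2(t-s)} \, ds, \]

which is bounded in absolute value by

\[ \|f\|_\infty \int_{t-\varepsilon}^t \frac{|x-y|}{\sqrt{2\pi} \, (t-s)^{3/2}} e^{-|x-y|^2/2(t-s)} \, ds. \]

Using the substitution $s = t - 1/u^2$ again this becomes

\[ \|f\|_\infty \frac{2}{\sqrt{2\pi}} \int_{1/\sqrt{\varepsilon}}^\infty |x-y| \, e^{-|x-y|^2 u^2/2} \, du. \]

Making the second substitution $u = w/|x-y|$, this becomes in turn

\[ \|f\|_\infty \frac{2}{\sqrt{2\pi}} \int_{|x-y|/\sqrt{\varepsilon}}^\infty e^{-w^2/2} \, dw \]

[noting the cancelation of two factors of $|x-y|$], which is bounded as $x \to y$
and so is also in $L^2(I)$ as a function of $x$. Another appeal to the Dominated
Convergence Theorem completes the proof that this tends to zero in $\| \cdot \|_{L^2(I)}$
as $\varepsilon \to 0$.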

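Therefore for any $\eta > 0$ we can choose $\varepsilon > 0$ so small that

\[ \Bigl\| \int_{t-\varepsilon}^t f(s) P_{t-s} \delta_y(\cdot) \, ds \Bigr\|_{H^1(I)} \leq \Bigl\| \int_{t-\varepsilon}^t f(s) P_{t-s} \delta_y(\cdot) \, ds \Bigr\|_{L^2(I)} + \Bigl\| D \int_{t-\varepsilon}^t f(s) P_{t-s} \delta_y(\cdot) \, ds \Bigr\|_{L^2(I)} \]
\[ \leq \|f\|_\infty \Bigl( \Bigl\| \int_{1/\sqrt{\varepsilon}}^\infty \frac{2}{\sqrt{2\pi} \, u^2} e^{-|(\cdot)-y|^2 u^2/2} \, du \Bigr\|_{L^2(I)} + \Bigl\| \frac{2}{\sqrt{2\pi}} \int_{|(\cdot)-y|/\sqrt{\varepsilon}}^\infty e^{-w^2/2} \, dw \Bigr\|_{L^2(I)} \Bigr) \leq \eta \|f\|_\infty. \]

Having done so, the uniform boundedness and smoothness properties of
$P_{t-s} \delta_y(x)$ (as a function of $x$) for $s$ bounded away from $t$ give at once some
$\delta > 0$ such that if $t \leq T$ and

\[ \bigl\| f \chi_{[0,t-\varepsilon]} \bigr\|_{L^1[0,T]} = \int_0^{t-\varepsilon} |f(s)| \, ds < \delta, \]

then

\[ \Bigl\| \int_0^{t-\varepsilon} f(s) P_{t-s} \delta_y(\cdot) \, ds \Bigr\|_{H^1(I)} < \eta. \]

For suitable $\eta$ this is just the second estimate that we wanted, and adding
to our previous inequality gives also the first estimate. □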

3.2. Finite-state-space Markov processes. We recall here some facts about
Markov processes with a finite state space $S$ and the corresponding space
of sample paths $D_S[0,\infty)$; or rather, as we will need, the space $D_S[0,T]$ of
paths taken only up to (and at) a finite time-horizon $T$. This space of paths
is treated comprehensively in Chapter 3 of Ethier and Kurtz [5].

For any complete, separable metric space $S$ the space $D_S[0,\infty)$ of càdlàg
paths from $[0,\infty)$ into $S$ is also complete and separable when endowed with
its Skorohod topology; in particular, this is so if $S$ is a finite set with its
discrete metric. The same is clearly true of our space $D_S[0,T]$. We will write
$\pi_t$ for the time-$t$ projection map $D_S[0,\infty) \to S : \omega \mapsto \omega_t$. The process $(\pi_t)_{t \geq 0}$
is referred to as the canonical process, and defines the canonical filtration

\[ \mathcal{F}_t = \sigma(\{\pi_s : s \leq t\}). \]

In the case of $S$ a finite set each path $\omega \in D_S[0,T]$ is completely
characterized by the following data:

• the total number $N(\omega)$ of jumps performed by the path (this is always
finite);

• the sequence of numbers $\sigma_j(\omega) > 0$, $j = 1, 2, \ldots, N(\omega)$, giving the time of
the $j$th jump (for convenience we also set $\sigma_0 = 0$, $\sigma_{N(\omega)+1} = T$);

• the sequence of states $\xi_j$, $j = 0, 1, \ldots, N(\omega)$, giving the starting state for
$j = 0$ and the landing state $\pi_{\sigma_j}(\omega)$ after the $j$th jump for $j \geq 1$.

It is clear that all of the above define measurable (in fact, continuous)
functions on $D_S[0,T]$.
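The characterization of a path by its jump data translates directly into a small data structure. The sketch below is only an illustration: the class name `JumpPath` and its fields are hypothetical, and the evaluation of the canonical process uses the right-continuity of càdlàg paths, so that at a jump time the path already sits in the landing state.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List

@dataclass
class JumpPath:
    """A cadlag path on [0, T] in a finite state space, stored as jump data."""
    jump_times: List[float]   # sigma_1 < sigma_2 < ... < sigma_N, all in (0, T]
    states: List[str]         # xi_0, xi_1, ..., xi_N (one more entry than jump_times)
    T: float

    def n_jumps(self) -> int:
        """Total number of jumps N(omega)."""
        return len(self.jump_times)

    def value(self, t: float) -> str:
        """State occupied at time t, i.e. the canonical projection pi_t(omega)."""
        if not 0.0 <= t <= self.T:
            raise ValueError("t must lie in [0, T]")
        # Number of jump times <= t is the index of the current state.
        return self.states[bisect_right(self.jump_times, t)]

path = JumpPath(jump_times=[0.2, 0.5], states=["a", "b", "a"], T=1.0)
```

With this representation, `path.value(0.2)` returns the landing state of the first jump, in line with right-continuity.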

When we come to construct a solution to our stochastic equations, we will 
need the following explicit computation of the absolutely continuous change 
of measure on the path space Ds[0,T] implied by the Girsanov theorem 
in the context of Markov processes with finite state spaces; in this sense it 
is an analogue of the Cameron-Martin theorem. The required theory and 
calculations are treated in Chapter III, Section 5 of Jacod and Shiryaev [13] 
(in a rather more general setting). 

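Lemma 3.2. Suppose that $\mathbb{P}_1$ is a probability measure on $\Omega = D_S[0,T]$
for which the canonical process $(\pi_t)_{t \in [0,T]}$ is a Markov process with all jump
rates equal to 1; and suppose also that for each $\xi, \zeta \in S$, $\xi \neq \zeta$, we are given
a progressively measurable function $\lambda_{\xi,\zeta} : \Omega \times [0,T] \to [0,\infty)$ with $\lambda_{\xi,\zeta}(\omega, \cdot)$
continuous for every $\omega$. Define $h : \Omega \to [0,\infty)$ by

\[ h(\omega) = \frac{\prod_{j=0}^{N(\omega)} \exp\Bigl( -\int_{\sigma_j(\omega)}^{\sigma_{j+1}(\omega)} \lambda_{\xi_j(\omega)}(\omega, s) \, ds \Bigr) \lambda_{\xi_j(\omega), \xi_{j+1}(\omega)}(\omega, \sigma_{j+1}(\omega))}{\prod_{j=0}^{N(\omega)} \exp\bigl( -(\sigma_{j+1}(\omega) - \sigma_j(\omega)) \bigr)} \]
\[ = e^{T} \prod_{j=0}^{N(\omega)} \Bigl( \exp\Bigl( -\int_{\sigma_j(\omega)}^{\sigma_{j+1}(\omega)} \lambda_{\xi_j(\omega)}(\omega, s) \, ds \Bigr) \lambda_{\xi_j(\omega), \xi_{j+1}(\omega)}(\omega, \sigma_{j+1}(\omega)) \Bigr) \]

and $\mathbb{P} = \mathbb{P}_1 \llcorner h$. Then under $\mathbb{P}$ the canonical process is a time-inhomogeneous
Markov process with jump rates $\lambda_{\xi,\zeta}(\omega, t)$ for $t \in [0,T]$, and $\mathbb{P}$ is the unique
probability on $\Omega$ with this property.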

3.3. Existence and uniqueness for the deterministic equations. 

Proposition 3.3. There is a unique weak solution to (D), and it
satisfies $v_- \leq v_t \leq v_+$ for all $t$.

Proof. This is a classical example of the use of fixed-point theorems 
and Gronwall's lemma in the study of nonlinear parabolic PDE, and we will 
not describe it here (we will see a more complicated example in the exis- 
tence and uniqueness result for the stochastic equations anyway). A thor- 
ough and readable treatment is Lamberti's [16], although the first existence 
and uniqueness results are for strong solutions and go back to Evans and 
Shenk [6]. Note that both of these papers give an analysis specific to the
original Hodgkin-Huxley equations, with particular forms for the states $\xi$
and proportions $p_\xi$; the method of analysis, however, extends to our case
immediately. □

3.4. Existence and uniqueness for the stochastic equations. Concerning 
the stochastic equations, our later proof of convergence will need only a 
suitable form of weak existence of solutions, and not uniqueness. However, 
it seems only natural to include proofs of both existence and uniqueness 
here. 

In constructing the process $(V, \Xi)$ we will need to introduce a particular
underlying filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0\le t\le T}, \mathbb{P})$, even though our
later convergence results hold for a suitable process on any such space. We
will choose the probability space with some additional structure that allows
us to interpret any $\omega \in \Omega$ as a driving signal from which we can (almost
surely) construct a corresponding sample path of our desired process. Thus,
the choice of a particular filtered probability space, and the subsequent
construction of a probability on it, can be thought of as the choice of how to
mimic the randomness apparent in the real-world system.

Now, the process we want takes values in the overall state space
$H_0^1(I) \times E^{\mathbb{Z}\cap NI^\circ}$, which is far from locally compact, and so the above construction
is not contained within the standard machinery of Feller processes: we will
need to construct our process with a little more care. We remark that this 
lack of local compactness is an artifact of our model's two different "small" 
scales (see Section 2.5): the function space arises as a result of the averaging 
over the "very small," and so leaves us to cope with the infinite-dimensional 
topology of that function space, while we still want to model the "fairly 
small" scale stochastically.

The key to our construction is to observe from the dynamics of $(S_N)$ that,
if we already knew the final form of the process $\Xi$, then for a fixed sample
path of $\Xi$ the evolution of the corresponding sample path of the process
$V$ would be deterministic. Happily, regarded just as a PDE, ($S_N$-PDE) is
a nonlinear parabolic equation, and at the core of the theory of these is a
standard procedure for proving existence of solutions. It relies on a handful of
classical fixed-point theorems for Banach spaces and a few ways of choosing
how to apply one of them to a suitable function space. We will use a slight
modification of that procedure applied $\omega$-by-$\omega$.

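To do this, we again convert the PDE to an integral equation:

\[
V_t = P_t v_0 + \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \int_0^t c_{\Xi_s(i)} (v_{\Xi_s(i)} - V_s(i/N)) (P_{t-s}\delta_{i/N})\,ds,
\]

where $(P_t)_{t\ge 0}$ is the Feller semigroup from Section 3.1.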

We will apply our chosen fixed-point result to the Banach space $X =
C_{H_0^1(I)}[0,T]$; the required differentiability properties of the function $V$ will
then follow from the integral equation. It is worth commenting on this choice
of space. For many applications in PDE the larger space $C_{L^2(I)}[0,T]$, with
its less restrictive topology, would be the appropriate choice. We are forced
to work with the smaller space by the slightly unusual nature of our PDE:
the right-hand term contains a linear combination of Dirac measures, and
so is not itself a function but only a member of $H^{-1}(I)$. It will turn out
when we construct our map from $X$ to itself below that the smoothing
properties of the heat semigroup are not enough to give a continuous
self-map of $C_{L^2(I)}[0,T]$, and so we are forced to work with the more complicated
space.

Motivated by these observations, we will attempt to construct $(V,\Xi)$
on the space $\Omega = D_{E^{\mathbb{Z}\cap NI^\circ}}[0,T]$ of càdlàg paths from $[0,T]$ into $E^{\mathbb{Z}\cap NI^\circ}$,
regarded as a filtered space as described in Section 3.2. We will start by
defining a "simple" probability $\mathbb{P}_1$ on this space, and will later obtain the
desired probability $\mathbb{P}$ as a suitable indefinite-integral measure with respect
to $\mathbb{P}_1$, using Lemma 3.2. This use of the path space makes more concrete
the above-mentioned driving signal interpretation of $\omega \in \Omega$.

We choose our probability $\mathbb{P}_1$ by specifying that under it the canonical
process $(\Xi_t)_{t\in[0,T]}$ is a Feller process with all jump rates between different
configurations in $E^{\mathbb{Z}\cap NI^\circ}$ equal to 1 (the existence of such a probability is
guaranteed by the usual theory of Feller processes).

Proposition 3.4. Fix $N \ge 1$ and let the filtered space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{0\le t\le T}, \mathbb{P})$
be as above. Then the space carries a solution to $(S_N)$ with initial conditions
$v_0$ and $\Xi_0$, and the law of this solution is unique.

Proof. The proof strategy is as follows:

1. Obtain from the integral form of ($S_N$-PDE) an equation for the trajectory
$(V_t(\omega))_{0\le t\le T}$ for a given input signal $\omega \in D_{E^{\mathbb{Z}\cap NI^\circ}}[0,T]$. This equation
will then be of the form $V(\omega) = \Psi(V(\omega),\omega)$ for a suitable map $\Psi: X \times \Omega \to X$.

2. Show that for each $\omega$ separately the map $\Psi_\omega = \Psi(\cdot,\omega): X \to X$:

(a) is continuous;

(b) has the compactness property needed for Schaefer's Fixed-Point Theorem
(the appropriate fixed-point theorem for this proof, since we do
not have any obvious contraction mapping);

(c) has the boundedness property needed for Schaefer's theorem;

and so obtain a nonempty set of fixed points for every $\omega \in \Omega$. It is here
that we will need our preliminary lemmas about the semigroup $(P_t)_{t\ge 0}$.
At this point we will also show that the trajectory $V(\omega)$ is unique, given
$\omega$.

3. Show that $\Psi: X \times \Omega \to X$ is measurable, and hence that the set of
pointwise fixed points $\{(U,\omega) : \Psi(U,\omega) = U\}$ is measurable and has a
nonempty section above every $\omega \in \Omega$, and apply the Measurable Selector
Theorem to give the function $V$.

4. Having thus obtained a suitable trajectory $V(\omega)$ for each $\omega$, varying
progressively measurably with the sample path $\omega$, we
can use Lemma 3.2 to write down a suitable Radon-Nikodym derivative
for a new probability $\mathbb{P}$ with respect to $\mathbb{P}_1$ so that under $\mathbb{P}$ the equation
($S_N$-jump) is also satisfied.

5. Finally, uniqueness will follow from the uniqueness for the PDE
corresponding to a single $\omega$ proved at the end of Step 2, and the uniqueness
part of Lemma 3.2.

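Step 1. For each $\omega \in D_{E^{\mathbb{Z}\cap NI^\circ}}[0,T]$ we need to solve the integral
equation

\[
V_t(\omega) = P_t v_0 + \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \int_0^t c_{\omega_s(i)} (v_{\omega_s(i)} - V_s(\omega)(i/N)) (P_{t-s}\delta_{i/N})\,ds;
\]

we let $\Psi(V(\omega),\omega)_t$ be the expression on the right-hand side.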

Step 2. Fix $\omega$. Recall Schaefer's Fixed-Point Theorem (see, e.g., Theorem 4
of Section 9.2.2 in Evans [7]):

Schaefer's Fixed-Point Theorem. Suppose that $X$ is a Banach
space and that $\Psi: X \to X$ is a continuous map that converts bounded
sequences to precompact sequences. Assume further that the set

\[
\{u \in X : u = \lambda \Psi u \text{ for some } \lambda \in [0,1]\}
\]

is bounded in $X$. Then $\Psi$ has a fixed point.



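We will check these relevant properties separately for $\Psi = \Psi_\omega$. We write
$\Psi_\omega$ as

\[
\Psi_\omega(U)_t = P_t v_0 + \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \Psi_{\omega,i}(U)_t,
\]

where

\[
\Psi_{\omega,i}(U)_t = \int_0^t c_{\omega_s(i)} (v_{\omega_s(i)} - U_s(i/N)) (P_{t-s}\delta_{i/N})\,ds.
\]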

Step 1 (Continuity). It suffices to prove this for each $\Psi_{\omega,i}$, for which it
will follow from Lemma 3.1. Suppose $U, W \in X$. Then

\[
\Psi_{\omega,i}(U)_t - \Psi_{\omega,i}(W)_t
= \int_0^t \big( c_{\omega_s(i)}(v_{\omega_s(i)} - U_s(i/N)) - c_{\omega_s(i)}(v_{\omega_s(i)} - W_s(i/N)) \big)(P_{t-s}\delta_{i/N})\,ds
\]
\[
\phantom{\Psi_{\omega,i}(U)_t - \Psi_{\omega,i}(W)_t}
= \int_0^t c_{\omega_s(i)}(W_s(i/N) - U_s(i/N))(P_{t-s}\delta_{i/N})\,ds.
\]

Since by Poincaré's inequality the norm

\[
\sup_{0\le t\le T} \|U_t - W_t\|_{H_0^1(I)}
\]

is stronger than

\[
\sup_{0\le t\le T} \|U_t - W_t\|_\infty
\]

to within a multiplicative constant, it follows that by selecting the former
sufficiently small we may make the multiplier $c_{\omega_s(i)}(W_s(i/N) - U_s(i/N))$ of
$P_{t-s}\delta_{i/N}$ uniformly small in the above integrand, and so make the norm

\[
\|\Psi_{\omega,i}(U)_t - \Psi_{\omega,i}(W)_t\|_{H_0^1(I)}
\]

as small as we please uniformly in $t \in [0,T]$, by Lemma 3.1.

Step 2 (Compactness). Here also it suffices to consider each $\Psi_{\omega,i}$
separately. Compactness now follows directly from the estimates of Lemma 3.1,
which allow us to approximate the integral expression by a linear
combination of the functions $P_s\delta_{i/N}$ for $s$ taken from some sufficiently large finite
subset of $(0,t)$.

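Step 3 (Boundedness). Suppose that $U \in X$ has $U = \lambda\Psi_\omega(U)$ for some
$\lambda \in [0,1]$. Writing this out more fully, it reads

\[
U_t = \lambda P_t v_0 + \lambda \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \int_0^t c_{\omega_s(i)} (v_{\omega_s(i)} - U_s(i/N)) (P_{t-s}\delta_{i/N})\,ds.
\]

Hence

\[
\|U_t\|_{H_0^1(I)} \le \lambda \|P_t v_0\|_{H_0^1(I)} + \lambda \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \left\| \int_0^t c_{\omega_s(i)} (v_{\omega_s(i)} - U_s(i/N)) (P_{t-s}\delta_{i/N})\,ds \right\|_{H_0^1(I)}
\]
\[
\le \lambda \|P_t v_0\|_{H_0^1(I)} + \lambda \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \left\| \int_0^t c_{\omega_s(i)} v_{\omega_s(i)} (P_{t-s}\delta_{i/N})\,ds \right\|_{H_0^1(I)}
+ \lambda \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \left\| \int_0^t c_{\omega_s(i)} U_s(i/N) (P_{t-s}\delta_{i/N})\,ds \right\|_{H_0^1(I)}.
\]

Rather than try to bound the growth of $\|U_t\|_{H_0^1(I)}$ directly using the above
inequality, we consider the maximal function $u(t) = \max_{0\le s\le t} \|U_s\|_{H_0^1(I)}$. By
Lemma 3.1, the first of the above sums is bounded by a fixed constant (since
there are only finitely many possible values for $c_\xi$ and $v_\xi$), which can be
chosen independent of $t \in [0,T]$. For the second sum, we break the integral
in each term into two pieces: an integral over a small interval $(t-\varepsilon, t)$ that
we can bound using the length $\varepsilon$ of the interval, and another over what
remains that we can bound because $P_{t-s}\delta_{i/N}$ is more regular there. This
idea is similar to that in the proof of Lemma 3.1. We select $\varepsilon$ so small that
the contribution of $\int_{t-\varepsilon}^t P_{t-s}\delta_{i/N}\,ds$, weighted by $\max_\xi |c_\xi|$, is small enough
that, using Lemma 3.1,

\[
\left\| \int_{t-\varepsilon}^t c_{\omega_s(i)} U_s(i/N) (P_{t-s}\delta_{i/N})\,ds \right\|_{H_0^1(I)} \le \tfrac{1}{2} u(t).
\]

Now the second estimate of Lemma 3.1 gives

\[
\left\| \int_0^{t-\varepsilon} c_{\omega_s(i)} U_s(i/N) (P_{t-s}\delta_{i/N})\,ds \right\|_{H_0^1(I)}
\le C_2(\varepsilon) \max_\xi |c_\xi| \int_0^{t-\varepsilon} \|U_s\|_\infty\,ds
\le C C_2(\varepsilon) \max_\xi |c_\xi| \int_0^t u(s)\,ds,
\]

where $C$ is the constant so that $\|\cdot\|_\infty \le C \|\cdot\|_{H_0^1(I)}$ guaranteed by Poincaré's
inequality. Reassembling the above inequalities, we find

\[
\|U_t\|_{H_0^1(I)} \le A + \lambda B \int_0^t u(s)\,ds + \tfrac{1}{2} u(t),
\]

where

\[
A = \lambda \|P_t v_0\|_{H_0^1(I)} + \lambda \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \sup_{0\le t\le T} \left\| \int_0^t c_{\omega_s(i)} v_{\omega_s(i)} (P_{t-s}\delta_{i/N})\,ds \right\|_{H_0^1(I)}
\]

and

\[
B = C C_2(\varepsilon) \max_\xi |c_\xi|,
\]

and so

\[
u(t) \le 2A + 2B \int_0^t u(s)\,ds
\]

(as $\lambda \le 1$). Now Gronwall's lemma gives a fixed bound on $u(t)$ for $t \in [0,T]$,
and so we have the desired bound in the space $X$.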
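The last step can be checked numerically. The following Python sketch, with hypothetical function names, iterates the extremal case of an integral inequality of the form $u(t) \le a + b\int_0^t u(s)\,ds$ on a grid and compares it with the exponential bound $a e^{bt}$ that Gronwall's lemma provides; in the step above one would take $a = 2A$ and $b = 2B$. It is only a sanity check of the shape of the bound, not part of the proof.

```python
import math
from typing import List


def gronwall_bound(a: float, b: float, t: float) -> float:
    """Bound u(t) <= a * exp(b * t) implied by u(t) <= a + b * int_0^t u(s) ds."""
    return a * math.exp(b * t)


def extremal_gronwall_iterates(a: float, b: float, horizon: float, steps: int) -> List[float]:
    """Discretize the extremal case u = a + b * int_0^t u with left-endpoint sums.

    Returns the values u_0, ..., u_steps on the uniform grid of [0, horizon].
    """
    dt = horizon / steps
    integral = 0.0
    values = []
    for _ in range(steps + 1):
        u = a + b * integral      # equality case of the integral inequality
        values.append(u)
        integral += u * dt        # left-endpoint Riemann sum of u
    return values
```

In this discretization $u_k = a(1 + b\,dt)^k$, which never exceeds $a e^{b k\,dt}$ and approaches it as the grid is refined.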

Completion of Step 2. Thus the set $Y_\omega$ of fixed points of $\Psi_\omega$ is
nonempty for every $\omega \in \Omega$.

We also wish to show that the solution to our PDE (or, equivalently,
integral equation) for a fixed $\omega$ is unique, that is, that $Y_\omega$ is a singleton
for every $\omega \in \Omega$. This is an easy calculation; for suppose $V^j(\omega) = \Psi_\omega(V^j)$,
$j = 1,2$, are two solutions for some $\omega$. Then, taking the difference, we find
that $U := V^1(\omega) - V^2(\omega)$ satisfies

\[
U_t = -\frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \int_0^t c_{\omega_s(i)} U_s(i/N) (P_{t-s}\delta_{i/N})\,ds.
\]

Repeating the trick of breaking $(0,t)$ into $(0,t-\varepsilon)$ and $(t-\varepsilon,t)$ and
applying Lemma 3.1, as used in Step 3 above, we now obtain the estimate

\[
\|U_t\|_{H_0^1(I)} \le B \int_0^t u(s)\,ds + \tfrac{1}{2} u(t),
\]

where once again $u(t) = \max_{0\le s\le t} \|U_s\|_{H_0^1(I)}$. Hence

\[
u(t) \le 2B \int_0^t u(s)\,ds
\]

for all $t \in [0,T]$, and by Gronwall's lemma $u \equiv 0$, and thus $V^1(\omega) = V^2(\omega)$.
Step 3. Now set

\[
Y = \{(U,\omega) \in X \times \Omega : U \text{ is a fixed point for } \Psi_\omega\}
= \{(U,\omega) \in X \times \Omega : \Psi(U,\omega) - U = 0\}.
\]

Since the map $X \times \Omega \to X$ sending $(U,\omega)$ to $\Psi(U,\omega) - U$ is measurable,
so is the set $Y$; and by Step 2, all of the sections $Y \cap (X \times \{\omega\}) = Y_\omega$ are
nonempty. $X$ is a separable Banach space, and so the Measurable Selector
Theorem (for a suitable version see, e.g., Theorem 10.1 in the Appendix
to Ethier and Kurtz [5]) guarantees a measurable function $V: \Omega \to X$ with
$V(\omega) \in Y_\omega$ for all $\omega$.

Step 4. Now we can obtain $\mathbb{P}$ as a suitable indefinite-integral measure
with respect to $\mathbb{P}_1$ so that ($S_N$-jump) is satisfied. This follows from Lemma
3.2 by taking the global jump-rates in $E^{\mathbb{Z}\cap NI^\circ}$ that correspond to the
independent jump-rates of single components given by ($S_N$-jump); that is, for
distinct configurations $\Xi_1, \Xi_2 \in E^{\mathbb{Z}\cap NI^\circ}$,

\[
\lambda_{\Xi_1,\Xi_2}(\omega,t) =
\begin{cases}
\alpha_{\Xi_1(i),\Xi_2(i)}(V_t(\omega)(i/N)) & \text{if } \Xi_1, \Xi_2 \text{ differ only in the coordinate } i, \\
0 & \text{if } \Xi_1, \Xi_2 \text{ differ in two coordinates or more.}
\end{cases}
\]

In order to apply the lemma, we need only note that the sample paths
of the process $V_t$ are continuous and depend only on the evolution of the
jump process up to time $t$ [from the form of the integral solution to
($S_N$-PDE)], and so the rate processes $\lambda_{\Xi_1,\Xi_2}$ are continuous and progressively
measurable.

Step 5. It remains to prove uniqueness in law. Suppose now that
$(V_t^\circ, \Xi_t^\circ)_{0\le t\le T}$ is some solution to $(S_N)$ on some abstract filtered space
$(\Omega^\circ, \mathcal{F}^\circ, (\mathcal{F}_t^\circ)_{0\le t\le T}, \mathbb{P}^\circ)$. Then, from ($S_N$-PDE), we know that

\[
V_t^\circ = P_t v_0 + \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \int_0^t c_{\Xi_s^\circ(i)} (v_{\Xi_s^\circ(i)} - V_s^\circ(i/N)) (P_{t-s}\delta_{i/N})\,ds
\]

for all $t \in [0,T]$, $\mathbb{P}^\circ$-almost surely. From the end of Step 2 above we know that
this equation specifies $V^\circ$ uniquely, given the sample path $\Xi^\circ$, and therefore
$V^\circ = V(\Xi^\circ)$, $\mathbb{P}^\circ$-almost surely. It now follows from ($S_N$-jump) that the law
of $\Xi^\circ$ under $\mathbb{P}^\circ$ must have the same inhomogeneous Markov property as
described for $\mathbb{P}$ in Step 4, and so, by the uniqueness part of Lemma 3.2,
these probability measures must be equal. The result follows. □



3.5. Three additional regularity results. In this subsection we collect three 
additional results that will be needed later. 
The first is almost immediate. 



Proposition 3.5. Suppose $v_0 \in H_0^1(I)$, $v_- \le v_0 \le v_+$, and consider (D)
with initial conditions $(v_0, (p_{\xi,0})_{\xi\in E})$ and each $(S_N)$ with initial conditions
$(v_0, \Xi_0^{(N)})$. Then there is some constant $C_3 > 0$ (independent of the exact
initial conditions and of $N$) such that

\[
\|v_t\|_\infty,\ \|V_t^{(N)}\|_\infty \le C_3
\]

for all $N \ge 1$ and all $t \in [0,T]$.



Proof. It follows from Proposition 3.3 that the constant $\max_{\xi\in E} |v_\xi|$
itself works for the deterministic trajectory $v$; it will therefore suffice to
find a constant that works simultaneously for all of the stochastic equations.
However, we have already proved this while establishing the boundedness in
Step 3 in the proof of existence for the stochastic equations, for the constants
$A$ and $B$ used there did not depend on $N$, $t \in [0,T]$ or $\omega$ and we may just
set $\lambda = 1$. This completes the proof. □



Remark. In fact it seems intuitively clear that $C_3 = \max_{\xi\in E} |v_\xi|$ should
work for the stochastic equations also, but I have not proved this.



The next result tells us a little more about the regularity of the map v: 



Lemma 3.6. Suppose $v_0$ and $p_{\xi,0}$, $\xi \in E$, have common Lipschitz constant
$K < \infty$, and that $v$ and $p_\xi$ are a solution to (D). Then

\[
\sup_{0\le t\le T} \|Dv_t\|_\infty < \infty.
\]



Remark. An $L^\infty$ bound such as this seems a little odd for a PDE for
which existence of solutions is most naturally studied in the Sobolev space
$H_0^1(I)$, and which has a distinctly quadratic flavor owing to its diffusive
nature. We will need this $L^\infty$ estimate later as a consequence of the very
specific form of the nonlinear term in (D-PDE).






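Proof of Lemma 3.6. We observe first that since $v: [0,T] \to H_0^1(I)$ is
continuous, so is the function from $[0,T] \times I \to \mathbb{R}$ defined by $(t,y) \mapsto v_t(y)$.
Therefore we do have

\[
\sup_{0\le t\le T} \|v_t\|_\infty < \infty,
\]

and hence also

\[
\sup_{0\le t\le T} \Big\| \sum_{\xi\in E} c_\xi\, p_{\xi,t} \cdot (v_\xi - v_t) \Big\|_\infty < \infty.
\]

Now the proof makes another use of conversion to an integral equation,
this time for (D-PDE):

\[
v_t = P_t v_0 + \int_0^t P_{t-s} \Big( \sum_{\xi\in E} c_\xi\, p_{\xi,s} \cdot (v_\xi - v_s) \Big)\,ds
= P_t v_0 + \sum_{\xi\in E} c_\xi \int_0^t P_{t-s}(p_{\xi,s} \cdot (v_\xi - v_s))\,ds.
\]

The easiest way to proceed is to appeal to the specific form for $t > 0$ of
the transition density $p_t(x,y)$ corresponding to the semigroup $(P_t)_{t\ge 0}$, as in
the proof of Lemma 3.1:

\[
p_t(x,y) = \frac{1}{\sqrt{2\pi t}} e^{-|x-y|^2/2t}
- \mathbb{E}_x\big(p_{t-\tau}(\ell,y)\chi_{\{\tau<t,\,W_\tau=\ell\}}\big)
- \mathbb{E}_x\big(p_{t-\tau}(-\ell,y)\chi_{\{\tau<t,\,W_\tau=-\ell\}}\big),
\]

where $\tau$ is the hitting time of the boundary of $I$ for a standard Brownian
motion $W$.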

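Using this, we can write for a continuous function $f: I \to \mathbb{R}$ and $x \in I$

\[
P_t f(x) = \int_I f(y) \frac{1}{\sqrt{2\pi t}} e^{-|x-y|^2/2t}\,\mu(dy)
- \int_I f(y)\,\mathbb{E}_x\big(p_{t-\tau}(\ell,y)\chi_{\{\tau<t,\,W_\tau=\ell\}}\big)\,\mu(dy)
- \int_I f(y)\,\mathbb{E}_x\big(p_{t-\tau}(-\ell,y)\chi_{\{\tau<t,\,W_\tau=-\ell\}}\big)\,\mu(dy)
\]
\[
= \int_I f(y) \frac{1}{\sqrt{2\pi t}} e^{-|x-y|^2/2t}\,\mu(dy)
- \mathbb{E}_x\Big( \int_I f(y)\, p_{t-\tau}(\ell,y)\chi_{\{\tau<t,\,W_\tau=\ell\}}\,\mu(dy) \Big)
- \mathbb{E}_x\Big( \int_I f(y)\, p_{t-\tau}(-\ell,y)\chi_{\{\tau<t,\,W_\tau=-\ell\}}\,\mu(dy) \Big)
\]

(where the second rearrangement follows from Fubini's theorem), and
therefore

\[
\int_0^t P_{t-s} f(x)\,ds = \int_0^t \int_I f(y) \frac{1}{\sqrt{2\pi(t-s)}} e^{-|x-y|^2/2(t-s)}\,\mu(dy)\,ds
\]
\[
\qquad - \mathbb{E}_x\Big( \int_0^t \int_I f(y)\, p_{t-s-\tau}(\ell,y)\chi_{\{\tau<t-s,\,W_\tau=\ell\}}\,\mu(dy)\,ds \Big)
- \mathbb{E}_x\Big( \int_0^t \int_I f(y)\, p_{t-s-\tau}(-\ell,y)\chi_{\{\tau<t-s,\,W_\tau=-\ell\}}\,\mu(dy)\,ds \Big).
\]

We can differentiate the first of these three terms with respect to $x$ under
the integral sign to give

\[
-\int_0^t \int_I f(y) \frac{x-y}{\sqrt{2\pi}\,(t-s)^{3/2}} e^{-|x-y|^2/2(t-s)}\,\mu(dy)\,ds,
\]

and now can see that this is bounded directly, since

\[
\int_0^t \frac{|x-y|}{\sqrt{2\pi}\,(t-s)^{3/2}} e^{-|x-y|^2/2(t-s)}\,ds
\]

is bounded as $x \to y$ by the same fixed estimate as appeared in the proof of
Lemma 3.1.

Bounds for the second and third integrals are proved by using just the
same estimates inside the expectations $\mathbb{E}_x$, and the explicit form of the
hitting time $\tau$. The result follows. □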

The last result in this subsection gives us bounds on the spatial derivatives
of solutions to $(S_N)$ that are independent of $N$, giving us a certain delicate
control that will be needed later when proving convergence.

Proposition 3.7. Suppose $v_0 \in H_0^1(I)$, $v_- \le v_0 \le v_+$ and consider (D)
with initial conditions $(v_0, (p_{\xi,0})_{\xi\in E})$ and each $(S_N)$ with initial conditions
$(v_0, \Xi_0^{(N)})$. Then there is some constant $C_4 > 0$ such that

\[
\int_0^t \|Dv_s\|_{L^2(I)}^2\,ds,\ \int_0^t \|DV_s^{(N)}\|_{L^2(I)}^2\,ds \le C_4
\]

for all $N \ge 1$ and all $t \in [0,T]$.

d ||vW||2

2(Vr),Avf) 



\ i&zr\Ni° I 



and so, rearranging and integrating, 



1 /■* ||VW||2 



+ It I max Cf I Cs C3 + max \vc I 



< C| + £T (^max j C3 (^^3 + max | | j ; 

now this right-hand side is a constant independent of $N$. The same reasoning
applied to (D) gives another constant independent of $N$, and so we may take
$C_4$ to be the larger of these two constants to give a simultaneous bound on

$\int_0^t \|Dv_s\|_{L^2(I)}^2\,ds$ and $\int_0^t \|DV_s^{(N)}\|_{L^2(I)}^2\,ds$. □



4. The main convergence result. 



4.1. A decomposition. Suppose that $(v, (p_\xi)_{\xi\in E})$ is a solution of (D) and
that $(V_t, \Xi_t)_{0\le t\le T}$ is a solution of $(S_N)$ for some $N \ge 1$ (in this subsection
we largely suppress $N$ in our notation, although it will be retained later
when we consider more than one value of $N$ at once).

For each $\xi \in E$ we will need to consider the process of differences
$(C_{\xi,N}(\Xi_t) - \mu\llcorner p_{\xi,t})_{0\le t\le T}$ taking values in $H^{-1}(I)$. We will decompose this as follows:

\[
C_{\xi,N}(\Xi_t) - \mu\llcorner p_{\xi,t} = C_{\xi,N}(\Xi_0) - \mu\llcorner p_{\xi,0} + \int_0^t Q_{\xi,s}(\Xi_s, V_s)\,ds + M_{\xi,t},
\]

where

\[
Q_{\xi,s}(\Xi,V) = \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \sum_{\zeta \in E\setminus\{\xi\}} \big( \delta_{\zeta,\Xi(i)}\,\alpha_{\zeta,\xi}(V(i/N)) - \delta_{\xi,\Xi(i)}\,\alpha_{\xi,\zeta}(V(i/N)) \big) \delta_{i/N} - \mu\llcorner \frac{d}{ds} p_{\xi,s}
\]

and the above relation is taken as the definition of the process $(M_{\xi,t})_{0\le t\le T}$.
We note that we have defined $Q_{\xi,s}(\Xi,V)$ for arbitrary $\Xi$, $V$, but that the
functions $p_{\xi,t}$ are taken as given, and so play a part in the definition.

For the purposes of this paper, we will refer to the integral of $Q_{\xi,s}(\Xi_s,V_s)$
as the finite variation part of this difference process and to $M_{\xi,t}$ as the
martingale part. These names are motivated by the analogy with the definition
of a Stroock-Varadhan martingale arising from a Feller process, and are
justified by Lemma 4.1 below. However, as we have already remarked, here the
underlying state space is not locally compact, and $M_{\xi,t}$ takes values in the
space of functionals $H^{-1}$, so we need to be careful about what we mean by
martingale.

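Lemma 4.1. Suppose $\phi$ is a bounded measurable function on $I$ and
consider the process

\[
\langle \phi, C_{\xi,N}(\Xi_t) - \mu\llcorner p_{\xi,t} \rangle.
\]

This decomposes as

\[
\langle \phi, C_{\xi,N}(\Xi_0) - \mu\llcorner p_{\xi,0} \rangle + \int_0^t \langle \phi, Q_{\xi,s}(\Xi_s,V_s) \rangle\,ds + \langle \phi, M_{\xi,t} \rangle
\]

and $(\langle \phi, M_{\xi,t} \rangle)_{0\le t\le T}$ is an $(\mathcal{F}_t)_{0\le t\le T}$-adapted càdlàg martingale.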

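Proof. Although this result can be made to fall under the general
theory of the Stroock-Varadhan martingale (see Proposition 1.7 in Chapter 4
of Ethier and Kurtz [5]), we give the calculation here. The càdlàg property
is clear. Suppose $t \in [0,T)$ and $h \in (0,T-t]$; then

\[
\mathbb{E}(\langle \phi, M_{\xi,t+h} \rangle - \langle \phi, M_{\xi,t} \rangle \mid \mathcal{F}_t)
= \mathbb{E}\big( \langle \phi, C_{\xi,N}(\Xi_{t+h}) - \mu\llcorner p_{\xi,t+h} \rangle - \langle \phi, C_{\xi,N}(\Xi_t) - \mu\llcorner p_{\xi,t} \rangle \mid \mathcal{F}_t \big)
- \int_t^{t+h} \mathbb{E}(\langle \phi, Q_{\xi,s}(\Xi_s,V_s) \rangle \mid \mathcal{F}_t)\,ds.
\]

Dividing by $h$, letting $h \downarrow 0$ and comparing with (D-prop) and ($S_N$-jump),
we see at once that the derivative

\[
\frac{d}{dh} \mathbb{E}(\langle \phi, M_{\xi,t+h} \rangle \mid \mathcal{F}_t) \Big|_{h=0}
\]

exists and equals 0; but now we can apply the Dominated Convergence
Theorem for conditional expectation to deduce that for any $h_0 < T - t$,

\[
\frac{d}{dh} \mathbb{E}(\langle \phi, M_{\xi,t+h} \rangle \mid \mathcal{F}_t) \Big|_{h=h_0}
= \mathbb{E}\Big( \frac{d}{du} \mathbb{E}(\langle \phi, M_{\xi,t+h_0+u} \rangle \mid \mathcal{F}_{t+h_0}) \Big|_{u=0} \,\Big|\, \mathcal{F}_t \Big) = 0,
\]

and so $\mathbb{E}(\langle \phi, M_{\xi,t+h} \rangle \mid \mathcal{F}_t)$ does not depend on $h$ and therefore equals $\langle \phi, M_{\xi,t} \rangle$
for all $h$. □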

Before leaving this subsection, we recall some of the standard machinery 
of jump measures and compensators in the context of our decomposition. A 
general treatment can be found, for example, in Chapter 22 of [14]. 

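Given the sample paths $(\Xi_t)_{0\le t\le T}$, we define for $i \in \mathbb{Z}\cap NI^\circ$ the random
jump measures $\kappa_i$ on $(0,T] \times E$ by

\[
\kappa_i = \sum_{t \in (0,T],\ \Xi_t(i) \ne \Xi_{t-}(i)} \delta_{(t,\Xi_t(i))}
\]

and also the associated compensators $\nu_i$ (also measures on $(0,T] \times E$) by

\[
\nu_i(dt,dy) = \sum_{\zeta \in E\setminus\{\Xi_{t-}(i)\}} \alpha_{\Xi_{t-}(i),\zeta}(V_t(i/N))\,\delta_\zeta(dy)\,dt.
\]

Given these, we can now rewrite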

Qti,s{^,V) = ^ (f {6^^y-6^^s^_(^,))ui{ds,dy))6,/N 



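and can express the martingale part of our decomposition as

\[
M_{\xi,t} = \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \Big( \int_{(0,t]\times E} (\delta_{\xi,y} - \delta_{\xi,\Xi_{s-}(i)})\,(\kappa_i - \nu_i)(ds,dy) \Big) \delta_{i/N}
\]

(note that the $p_\xi$ do not enter here at all). In particular, for $\phi$ as in Lemma
4.1, we have

\[
\langle \phi, M_{\xi,t} \rangle = \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \int_{(0,t]\times E} \phi(i/N)\,(\delta_{\xi,y} - \delta_{\xi,\Xi_{s-}(i)})\,(\kappa_i - \nu_i)(ds,dy).
\]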

The reason we have given these details is so that, when we next use this 
decomposition in Section 4.3, we can call on the following standard result: 

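Lemma 4.2. With $\phi$, $\xi$ as above we can evaluate the $L^2$-norm of
$\langle \phi, M_{\xi,t} \rangle$ thus:

\[
\mathbb{E}(\langle \phi, M_{\xi,t} \rangle^2) = \frac{1}{N^2} \sum_{i \in \mathbb{Z}\cap NI^\circ} \mathbb{E} \int_{(0,t]\times E} \phi(i/N)^2\,(\delta_{\xi,y} - \delta_{\xi,\Xi_{s-}(i)})^2\,\nu_i(ds,dy).
\]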
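The simplest instance of an identity of this kind is a single Poisson process $N_t$ of constant rate, whose compensator is (rate) $\cdot\, t$: the compensated process $N_t - \text{rate}\cdot t$ is a martingale with second moment exactly $\text{rate}\cdot t$. The Python sketch below estimates that second moment by Monte Carlo; it is an illustration of the standard fact only, with a hypothetical function name, and does not simulate the channel model itself.

```python
import random


def compensated_poisson_second_moment(
    rate: float, horizon: float, samples: int, rng: random.Random
) -> float:
    """Monte Carlo estimate of E[(N_t - rate * t)^2] for a rate-`rate` Poisson process N."""
    total = 0.0
    for _ in range(samples):
        count = 0
        clock = rng.expovariate(rate)       # time of the first arrival
        while clock <= horizon:
            count += 1
            clock += rng.expovariate(rate)  # independent exponential gaps
        total += (count - rate * horizon) ** 2
    return total / samples
```

With `rate = 1.5` and `horizon = 2.0` the estimate should be close to `rate * horizon = 3.0`, the total mass of the compensator on $(0,t]$.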

4.2. The plan of campaign. There is a standard theory of convergence 
and characterization of Markov processes. This is well developed for Feller 
processes with a locally compact state space, in which case it relies on con- 
vergence criteria for the generator of the corresponding Feller semigroup, 
but becomes much more complicated and less applicable in more general 
metric spaces. A thorough account of both cases can be found in Ethier and 
Kurtz [5]; in particular, the various more general convergence theorems are 
given in Section 4.8. 

In this paper we use more hands-on estimates to prove our desired form 
of convergence; given our large state space, I do not know whether the 
result could be proved by verifying enough conditions to apply one of the 
above-mentioned more general convergence theorems. The ideas we will use 
are motivated by a treatment of Kurtz's theorem which deals solely with 
explicit bounds on norms and probabilities, developed in the first instance 
for the case of fluid limits of pure jump processes, as described in [4]. This 
is due to Darling and Norris; Kurtz's original argument can be found, for 
example, in Kurtz [15] . In a sense, we follow an infinite-dimensional version 
of the Darling-Norris argument in a function space; this is made possible
by the smoothing diffusive properties of the time-evolution PDE that takes
the place of the ODE in their theory.
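To give a feel for the finite-dimensional fluid limits mentioned here, the following Python sketch simulates a toy system that is much simpler than the model of this paper: $n$ independent two-state channels with constant opening and closing rates and no coupling to a potential. The empirical open fraction is compared with the solution of the limiting ODE $p' = a(1-p) - bp$, $p(0) = 0$. All names and parameter values are hypothetical and chosen only for illustration.

```python
import math
import random


def max_deviation_from_fluid_limit(
    n_channels: int,
    rate_open: float,
    rate_close: float,
    horizon: float,
    rng: random.Random,
) -> float:
    """Gillespie simulation of n independent two-state channels, all closed at t = 0.

    Returns the largest deviation, checked just before and just after each jump
    in [0, horizon], between the empirical open fraction and the ODE solution
    p(t) = a / (a + b) * (1 - exp(-(a + b) t)).
    """
    total_rate = rate_open + rate_close

    def limit(t: float) -> float:
        return rate_open / total_rate * (1.0 - math.exp(-total_rate * t))

    t, n_open, worst = 0.0, 0, 0.0
    while True:
        opening = rate_open * (n_channels - n_open)   # total rate of closed -> open
        closing = rate_close * n_open                 # total rate of open -> closed
        t += rng.expovariate(opening + closing)
        if t > horizon:
            return worst
        worst = max(worst, abs(n_open / n_channels - limit(t)))  # just before the jump
        if rng.random() * (opening + closing) < opening:
            n_open += 1
        else:
            n_open -= 1
        worst = max(worst, abs(n_open / n_channels - limit(t)))  # just after the jump
```

For a few thousand channels the returned deviation is typically of order $n^{-1/2}$, in line with the explicit probability bounds that arguments of this type are designed to produce.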

Our plan of campaign for proving our main theorem is as follows: 

1. Decide which quantities should converge to the deterministic behavior
as $N \to \infty$, and (importantly) in what sense they should converge: we
will be working mostly in certain function spaces and their duals, and
the desired convergence will hold only in the appropriate topology. The
relevant choices are explicit in the statement of Theorem 2.3: we measure
the deviation in the potential difference functions $V_t^{(N)}$ and $v_t$ by the
$H_0^1$-norm $\|V_t^{(N)} - v_t\|_{H_0^1(I)}$ and the deviation in the channel states by
the $H^{-1}$-norm $\|C_{\xi,N}(\Xi_t^{(N)}) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)}$. These particular choices are
not uncommon in the study of deterministic PDE. Their motivation is
certainly partly that they capture the relevant sort of convergence (the
process $C_{\xi,N}(\Xi_t^{(N)}) - \mu\llcorner p_{\xi,t}$ can converge only in a fairly weak sense, since
a linear combination of Dirac measures can be "close to" an absolutely
continuous measure only in a weak sense), but it is also important that
we can calculate using these particular norms very easily. We will see this
in Section 4.5.

2. Having decided how to measure the deviations of the stochastic evolution
from the deterministic, we will work (quite hard) to prove either absolute
bounds or Gronwall-like growth conditions on those deviations by using
properties of the equations (D) and $(S_N)$ and of the function spaces
involved. In fact we will prove such bounds for three different processes.
First we prove an absolute bound (in probability) on the $H^{-1}$-norm of
the martingale part of the difference processes $C_{\xi,N}(\Xi_t^{(N)}) - \mu\llcorner p_{\xi,t}$.

3. Next we bound the growth of the $H^{-1}$-norm of the finite variation part
of the difference processes $C_{\xi,N}(\Xi_t^{(N)}) - \mu\llcorner p_{\xi,t}$.

4. Finally we bound the growth of the difference of potentials $V_t^{(N)} - v_t$.
In fact, most of the work will go into bounding the $L^2$-norm
$\|V_t^{(N)} - v_t\|_{L^2(I)}$, and then combining this with the bounds on the different parts
of $C_{\xi,N}(\Xi_t^{(N)}) - \mu\llcorner p_{\xi,t}$ to enable an application of Gronwall's lemma.
Only after an analogous version of the main theorem has been proved
with this weaker $L^2$-estimate on $V_t^{(N)} - v_t$ will we bootstrap our results
to give the desired $H_0^1$-bound; this will follow from standard properties
of the semigroup $(P_t)_{t\ge 0}$.

This plan will be executed fully in the subsections that follow. 
4.3. Bounding the martingale part.

Lemma 4.3. For any $\phi \in L^\infty(I)$ we have

E((</.,A%i)2)<8^max||a5,^||oo^|^t 

for all $t \in [0,T]$, $N \ge 1$.

Remark. It is for the proof of this result that we introduced jump 
measures and compensators in Section 4.1, as we find here that Lemma 4.2 
makes our lives very much easier. 



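Proof of Lemma 4.3. As in Lemma 4.2:

\[
\mathbb{E}(\langle \phi, M_{\xi,t} \rangle^2) = \frac{1}{N^2} \sum_{i \in \mathbb{Z}\cap NI^\circ} \mathbb{E} \int_{(0,t]\times E} \phi(i/N)^2\,(\delta_{\xi,y} - \delta_{\xi,\Xi_{s-}(i)})^2\,\nu_i(ds,dy).
\]

Substituting our definition of $\nu_i$, this becomes

\[
\mathbb{E}(\langle \phi, M_{\xi,t} \rangle^2) = \frac{1}{N^2} \sum_{i \in \mathbb{Z}\cap NI^\circ} \mathbb{E} \int_0^t \phi(i/N)^2 \sum_{\zeta \in E\setminus\{\Xi_{s-}(i)\}} (\delta_{\xi,\zeta} - \delta_{\xi,\Xi_{s-}(i)})^2\,\alpha_{\Xi_{s-}(i),\zeta}(V_s(i/N))\,ds,
\]

and now the given bound is clear by inspection. □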

Lemma 4.4. Fix $C > 0$ and suppose $T > 0$ and $\varepsilon > 0$. Then for any
sufficiently large $N$, say $N \ge N_1$, we can find a subset $\Omega_1$ of $\Omega$ such that



on all of $\Omega_1$.

Remark. This lemma tells us that for sufficiently large $N$ the martingales
$\langle \psi, M_\xi \rangle$ for $\psi$ of the form described can all be controlled simultaneously with
high probability. We will prove this by using estimates on this martingale
for finitely many individual functions, and then approximating an arbitrary
function by combinations of these.

This is possible because, by the previous lemma, we can control the size
of $\langle \psi, M_{\xi,t} \rangle$ if we know only the uniform norm $\|\psi\|_\infty$ of $\psi$, but any bounded
subset in the Sobolev space $H_0^1(I)$ is relatively compact for the uniform topology
on $C(I)$. [Indeed, as is well known, $H_0^1(I)$ embeds continuously into the
space of functions on $I$ that are Hölder-$\frac{1}{2}$ continuous, and so we have the
equicontinuity needed to apply the Arzelà-Ascoli theorem.] This means that,
in the uniform norm, we can approximate the whole of any bounded subset
of $H_0^1(I)$ with only finitely many of its members.
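The Hölder-$\frac{1}{2}$ embedding used here rests on the one-line estimate $|f(x) - f(y)| = |\int_x^y Df\,d\mu| \le |x-y|^{1/2} \|Df\|_{L^2}$, which is Cauchy-Schwarz. The Python sketch below checks the discrete analogue of this estimate for a piecewise-linear function sampled on a uniform grid; the function name and the choice of test function are hypothetical.

```python
import math
from typing import Sequence


def holder_half_ratio(values: Sequence[float], spacing: float) -> float:
    """Largest |f(x) - f(y)| / (|x - y|^(1/2) * ||Df||_{L^2}) over all grid pairs.

    `values` are samples of a piecewise-linear f on a uniform grid with the
    given spacing; ||Df||_{L^2} is computed exactly for that interpolant.
    Assumes f is not constant, so that the derivative norm is positive.
    """
    increments = [b - a for a, b in zip(values, values[1:])]
    deriv_norm = math.sqrt(sum(d * d for d in increments) / spacing)
    worst = 0.0
    for i in range(len(values)):
        for j in range(i + 1, len(values)):
            gap = (j - i) * spacing
            ratio = abs(values[j] - values[i]) / (math.sqrt(gap) * deriv_norm)
            worst = max(worst, ratio)
    return worst
```

By Cauchy-Schwarz the returned ratio can never exceed 1 (up to rounding), whatever grid function is supplied.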

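Proof of Lemma 4.4. Let $\mathcal{E}$ be the set of $\psi$ satisfying the stated
bounds. Since $\langle (-\psi), M_{\xi,t} \rangle = -\langle \psi, M_{\xi,t} \rangle$ and $\psi \in \mathcal{E}$ if and only if $-\psi \in \mathcal{E}$,
it suffices to prove the above with the last inequality replaced by

Since $(M_{\xi,t})_{0\le t\le T}$ is a martingale, replacing $\varepsilon$ by $\frac{\varepsilon}{4}$ and using Doob's
martingale inequality shows further that it actually suffices to prove the
inequality



Step 1. Since we can bound $C_{\xi,N}(\Xi)$, $\mu\llcorner p_{\xi,t}$ and $Q_{\xi,s}(\Xi,V)$ in total variation
independently of $N$ and $t \in [0,T]$, we can choose some $\eta > 0$ such that
whenever $\|\psi\|_\infty < \eta$, then also $\langle \psi, M_{\xi,t} \rangle^2 < \frac{\varepsilon}{4}$.

Step 2. Use the above-mentioned compact embedding to choose finitely
many $\phi_1, \phi_2, \ldots, \phi_k \in \mathcal{E}$ so that any $\psi \in \mathcal{E}$ has $\|\psi - \phi_j\|_\infty < \eta$ for some $j \le k$.
Applying Lemma 4.3 to each of the functions $\phi_i$, we can choose $N_1 \ge 1$ such
that if $N \ge N_1$, then we have

for all $t \in [0,T]$ and for $1 \le i \le k$. It follows from Chebyshev's inequality
that

\[
\mathbb{P}\big( \langle \phi_i, M_{\xi,t} \rangle^2 \ge \tfrac{\varepsilon}{4} \text{ for some } 1 \le i \le k,\ t \in [0,T] \big) < \varepsilon;
\]

set

\[
\Omega_1 = \big\{ \langle \phi_i, M_{\xi,t} \rangle^2 < \tfrac{\varepsilon}{4} \text{ for all } 1 \le i \le k,\ t \in [0,T] \big\}.
\]

Step 3. Now let $\psi \in \mathcal{E}$, and choose $j \le k$ so that $\|\psi - \phi_j\|_\infty < \eta$. Since
$\langle \psi - \phi_j, M_{\xi,t} \rangle^2 < \frac{\varepsilon}{4}$, on $\Omega_1$ we must have

\[
\langle \psi, M_{\xi,t} \rangle^2 \le \big( |\langle \psi - \phi_j, M_{\xi,t} \rangle| + |\langle \phi_j, M_{\xi,t} \rangle| \big)^2 < \Big( \frac{\sqrt{\varepsilon}}{2} + \frac{\sqrt{\varepsilon}}{2} \Big)^2 = \varepsilon. \qquad □
\]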

Corollary 4.5. Suppose $T > 0$ and $\delta, \varepsilon > 0$. Then for any sufficiently
large $N$, say $N \ge N_1$, we can find a subset $\Omega_1$ of $\Omega$ such that

\[
\mathbb{P}(\Omega \setminus \Omega_1) < \varepsilon
\]

and

\[
\sup_{0\le t\le T} \|M_{\xi,t}\|_{H^{-1}(I)} < \delta
\]

on all of $\Omega_1$.

4.4. Bounded growth of the finite variation part. In this subsection we
will start to tie together the processes $C_{\xi,N}(\Xi_t^{(N)}) - \mu\llcorner p_{\xi,t}$ and $V_t^{(N)} - v_t$
(retaining now the superscript $N$).

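Lemma 4.6. With the notation of the start of Section 3, there is a
constant $C_5 > 0$ independent of $N$ such that for every $\xi \in E$, $\Xi \in E^{\mathbb{Z}\cap NI^\circ}$ and
$V \in H_0^1(I)$

\[
\|Q_{\xi,t}(\Xi,V)\|_{H^{-1}(I)} \le C_5 (1 + \|V\|_{H_0^1(I)}) \sum_{\zeta\in E} \|C_{\zeta,N}(\Xi) - \mu\llcorner p_{\zeta,t}\|_{H^{-1}(I)}
+ C_5 \|V - v_t\|_{L^2(I)}
\]

for all $t \in [0,T]$.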

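Proof. Writing out more fully the definition of $Q_{\xi,t}(\Xi,V)$ and
expanding using (D-prop) we have

\[
\begin{aligned}
Q_{\xi,t}(\Xi,V)
&= \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \sum_{\zeta \in E\setminus\{\xi\}} \big( \delta_{\zeta,\Xi(i)}\,\alpha_{\zeta,\xi}(V(i/N)) - \delta_{\xi,\Xi(i)}\,\alpha_{\xi,\zeta}(V(i/N)) \big) \delta_{i/N} - \mu\llcorner \frac{d}{dt} p_{\xi,t} \\
&= \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \sum_{\zeta \in E\setminus\{\xi\}} \big( \delta_{\zeta,\Xi(i)}\,\alpha_{\zeta,\xi}(V(i/N)) - \delta_{\xi,\Xi(i)}\,\alpha_{\xi,\zeta}(V(i/N)) \big) \delta_{i/N} \\
&\qquad - \mu\llcorner \sum_{\zeta \in E\setminus\{\xi\}} \big( (\alpha_{\zeta,\xi} \circ v_t) \cdot p_{\zeta,t} - (\alpha_{\xi,\zeta} \circ v_t) \cdot p_{\xi,t} \big) \\
&= \Big( \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \sum_{\zeta \in E\setminus\{\xi\}} \delta_{\zeta,\Xi(i)}\,\alpha_{\zeta,\xi}(V(i/N))\,\delta_{i/N} - \sum_{\zeta \in E\setminus\{\xi\}} \mu\llcorner (\alpha_{\zeta,\xi} \circ v_t) \cdot p_{\zeta,t} \Big) \\
&\qquad - \Big( \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \sum_{\zeta \in E\setminus\{\xi\}} \delta_{\xi,\Xi(i)}\,\alpha_{\xi,\zeta}(V(i/N))\,\delta_{i/N} - \sum_{\zeta \in E\setminus\{\xi\}} \mu\llcorner (\alpha_{\xi,\zeta} \circ v_t) \cdot p_{\xi,t} \Big) \\
&= \sum_{\zeta \in E\setminus\{\xi\}} \Big( \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \delta_{\zeta,\Xi(i)}\,\alpha_{\zeta,\xi}(V(i/N))\,\delta_{i/N} - \mu\llcorner (\alpha_{\zeta,\xi} \circ v_t) \cdot p_{\zeta,t} \Big) \\
&\qquad - \sum_{\zeta \in E\setminus\{\xi\}} \Big( \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \delta_{\xi,\Xi(i)}\,\alpha_{\xi,\zeta}(V(i/N))\,\delta_{i/N} - \mu\llcorner (\alpha_{\xi,\zeta} \circ v_t) \cdot p_{\xi,t} \Big).
\end{aligned}
\]

This has put $Q_{\xi,t}$ into a form with which we can work: we will now consider
separately the individual terms in each sum. We will show the working for
the first; the second is treated similarly.

The term in question is

\[
\frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \delta_{\zeta,\Xi(i)}\,\alpha_{\zeta,\xi}(V(i/N))\,\delta_{i/N} - \mu\llcorner (\alpha_{\zeta,\xi} \circ v_t) \cdot p_{\zeta,t};
\]

let us call this $Q_{\xi,\zeta,t}(\Xi,V)$. Suppose that $\theta \in H_0^1(I)$ with $\|\theta\|_{H_0^1(I)} \le 1$; then

we find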

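\[
\begin{aligned}
\langle \theta, Q_{\xi,\zeta,t}(\Xi,V) \rangle
&= \frac{1}{N} \sum_{i \in \mathbb{Z}\cap NI^\circ} \delta_{\zeta,\Xi(i)}\,\alpha_{\zeta,\xi}(V(i/N))\,\theta(i/N) - \int_I \theta \cdot (\alpha_{\zeta,\xi} \circ V) \cdot p_{\zeta,t}\,d\mu \\
&\qquad + \int_I \theta \cdot \big( (\alpha_{\zeta,\xi} \circ V) - (\alpha_{\zeta,\xi} \circ v_t) \big) \cdot p_{\zeta,t}\,d\mu.
\end{aligned}
\]

The first line above is just

\[
\langle \theta \cdot (\alpha_{\zeta,\xi} \circ V),\ C_{\zeta,N}(\Xi) - \mu\llcorner p_{\zeta,t} \rangle,
\]

and so, bounding the first and second terms separately (using
Cauchy-Schwarz for the second),

\[
|\langle \theta, Q_{\xi,\zeta,t}(\Xi,V) \rangle| \le \|\theta \cdot (\alpha_{\zeta,\xi} \circ V)\|_{H_0^1(I)} \cdot \|C_{\zeta,N}(\Xi) - \mu\llcorner p_{\zeta,t}\|_{H^{-1}(I)}
+ \mathrm{Lip}(\alpha_{\zeta,\xi}) \cdot \|\theta\|_\infty \sqrt{2\ell}\, \|V - v_t\|_{L^2(I)}.
\]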
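Next we note that for any $\phi, \psi \in H_0^1(I)$ we have

\[
D(\phi\psi) = (D\phi)\psi + \phi(D\psi), \quad \mu\text{-a.e.};
\]

this can be seen directly, using the fact that $D\psi$ and $D\phi$ both exist as the
usual limit of quotients $\mu$-a.e. It follows that $\phi\psi \in H_0^1(I)$, and that we can
bound $\|\phi\psi\|_{H_0^1(I)}$ in the following way:

\[
\int_I (\phi\psi)^2 + (D(\phi\psi))^2\,d\mu \le \|\phi\psi\|_{L^2(I)}^2 + \int_I \big( (D\phi)\psi + \phi(D\psi) \big)^2\,d\mu
\]
\[
\le \|\phi\|_\infty^2 \|\psi\|_{L^2(I)}^2 + 2 \int_I ((D\phi)\psi)^2\,d\mu + 2 \int_I (\phi(D\psi))^2\,d\mu.
\]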




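Applying this with $\phi = \theta$ and $\psi = \alpha_{\zeta,\xi} \circ V$, we obtain

\[
\|\theta \cdot (\alpha_{\zeta,\xi} \circ V)\|_{H_0^1(I)}^2
\le \|\theta\|_\infty^2 \|\alpha_{\zeta,\xi} \circ V\|_{L^2(I)}^2
+ 2 \int_I \big( (D\theta)(\alpha_{\zeta,\xi} \circ V) \big)^2 + \big( \theta\,(D(\alpha_{\zeta,\xi} \circ V)) \big)^2\,d\mu
\]
\[
\le \|\theta\|_\infty^2 \|\alpha_{\zeta,\xi} \circ V\|_{L^2(I)}^2
+ 2 \|\alpha_{\zeta,\xi}\|_\infty^2 \|D\theta\|_{L^2(I)}^2
+ 2 \|\theta\|_\infty^2\, \mathrm{Lip}(\alpha_{\zeta,\xi})^2\, \|DV\|_{L^2(I)}^2.
\]

Now by Poincaré's inequality the norm $\|\cdot\|_\infty$ is bounded by $\|\cdot\|_{H_0^1(I)}$ to
within a multiplicative constant; since also $\alpha_{\zeta,\xi}$ is differentiable, $\mathrm{Lip}(\alpha_{\zeta,\xi}) <
\infty$ and $\|\alpha_{\zeta,\xi}\|_\infty < \infty$, it follows that there is some $C < \infty$ for which

\[
\|\theta \cdot (\alpha_{\zeta,\xi} \circ V)\|_{H_0^1(I)} \le C (1 + \|V\|_{H_0^1(I)})
\]

whenever $\|\theta\|_{H_0^1(I)} \le 1$. Replacing $C$ by $C \vee (\sqrt{2\ell} \max_{\xi,\zeta\in E} \mathrm{Lip}(\alpha_{\xi,\zeta}))$ if
necessary and substituting back into our bound for

\[
|\langle \theta, Q_{\xi,\zeta,t}(\Xi,V) \rangle|,
\]

we obtain

\[
|\langle \theta, Q_{\xi,\zeta,t}(\Xi,V) \rangle|
\le C (1 + \|V\|_{H_0^1(I)}) \|C_{\zeta,N}(\Xi) - \mu\llcorner p_{\zeta,t}\|_{H^{-1}(I)} + C \|V - v_t\|_{L^2(I)}
\]

when $\|\theta\|_{H_0^1(I)} \le 1$. Summing over $\zeta \in E$ to recover the terms of our original
expression for $Q_{\xi,t}(\Xi,V)$ and picking $C_5 = 2|E|C$ now gives the result. □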

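Corollary 4.7. There is a constant $C_6 > 0$ independent of $N$ such that
the process $(C_{\xi,N}(\Xi_t^{(N)}) - \mu\llcorner p_{\xi,t})_{0\le t\le T}$ satisfies

\[
\|C_{\xi,N}(\Xi_t^{(N)}) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)}
\le \|C_{\xi,N}(\Xi_0) - \mu\llcorner p_{\xi,0}\|_{H^{-1}(I)}
+ C_6 \sqrt{ \int_0^t \sum_{\zeta\in E} \|C_{\zeta,N}(\Xi_s^{(N)}) - \mu\llcorner p_{\zeta,s}\|_{H^{-1}(I)}^2\,ds }
\]
\[
\qquad + C_6 \sqrt{t} \sqrt{ \int_0^t \|V_s^{(N)} - v_s\|_{L^2(I)}^2\,ds }
+ \|M_{\xi,t}\|_{H^{-1}(I)}.
\]

Proof. Integrating the inequality from the previous lemma in the
decomposition of the difference process yields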

iir- II 

< l|C5,Ar(Ho) - ^lP5,o||h-1(/) 
rt 



+ C5 r ||Vf E l|Cc,iv(3f -/^.PC,.IIh-i(/) 



ds 



j-t 

+ C5 / II Vf^ - VsIIl2(/) ||M^^j||^_i(j). 

Now we apply the Cauchy-Schwarz inequality to each of the three integrals 
on the right-hand side to obtain 
llr-' ("W. II 

< l|C5,Ar(Ho) - /^lP5,oIIh-i(/) 



+ C5 



, /YEllCc,^(3i^^)-/^-Pc,. 



(is 



ds 



+ ll^5,i||H-i(/)- 

Another application of the Cauchy-Schwarz inequality, this time to the sum 
inside the first and second integrals, gives 

iir-' II 
ll'-^C,Afl'='t ) ~ f^^P(,t\\H-'^{l) 



I |£;|^||Cc,,v(Hf))-A^Lpc„ 



ds 



1^7(^)112 J 



/ |£^l E IICc,Jv(Hi^^) - ^lL.Y>(^,s\\H-^I) ds 
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+ C5ViJ f \\V^s 



(N) 



^s\\L'i(I) 



ds 



V JO 

+ \We.,t\\H-^{i)- 



This gives the desired result with 



C6 = C5(1V(Vt + ^|S|C4)) 



where C4 is the constant from Proposition 3.7 such that 



f 

Jo 



Vf)||^.(,)d.<C4 



for all > 1 and t G [0,r]. □ 
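The two uses of the Cauchy-Schwarz inequality in the proof above take the following generic forms. Here $g$ stands for any nonnegative square-integrable function on $[0,t]$ and $(a_\zeta)_{\zeta \in E}$ for any family of nonnegative reals; both are placeholder names chosen for this sketch.

```latex
% g and a_\zeta are generic placeholders, not notation from the paper.
\begin{align*}
\int_0^t g(s)\,ds = \int_0^t 1 \cdot g(s)\,ds
  &\le \sqrt{t}\,\Bigl(\int_0^t g(s)^2\,ds\Bigr)^{1/2}, \\
\Bigl(\sum_{\zeta \in E} a_\zeta\Bigr)^2
  = \Bigl(\sum_{\zeta \in E} 1 \cdot a_\zeta\Bigr)^2
  &\le |E| \sum_{\zeta \in E} a_\zeta^2 .
\end{align*}
```

The first form converts each time integral of a norm into the square root of an integral of a squared norm; the second is what introduces the factor $|E|$.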

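The above inequality is not yet of the particular form needed to apply
Gronwall's lemma (since we need a Gronwall-like bound on the growth of
the square of the $H^{-1}$-norm of the difference process); however, this requires
only one further (slightly brutal) manipulation.

Corollary 4.8. With $C_6$ as in the previous corollary the process
$(C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t})_{0 \leq t \leq T}$ satisfies

$$\begin{aligned}
\|C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)}^2
\leq 4 \Bigl( & \|C_{\xi,N}(\Xi_0) - \mu\llcorner p_{\xi,0}\|_{H^{-1}(I)}^2 \\
&+ C_6^2 \int_0^t \sum_{\zeta \in E} \|C_{\zeta,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\zeta,s}\|_{H^{-1}(I)}^2 \, ds \\
&+ C_6^2 \, t \int_0^t \|V^{(N)}_s - v_s\|_{L^2(I)}^2 \, ds + \|M_{\xi,t}\|_{H^{-1}(I)}^2 \Bigr).
\end{aligned}$$

Proof. This follows from squaring the inequality from the previous
corollary and applying the Cauchy-Schwarz inequality. □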
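The factor 4 produced by this squaring step comes from the Cauchy-Schwarz inequality applied to a sum of four nonnegative terms. A minimal worked version, with $a, b, c, d$ as placeholder names for the four terms on the right-hand side of the unsquared inequality:

```latex
% a, b, c, d >= 0 are placeholders for the four right-hand terms.
\begin{align*}
(a + b + c + d)^2
  &= (1 \cdot a + 1 \cdot b + 1 \cdot c + 1 \cdot d)^2 \\
  &\le (1^2 + 1^2 + 1^2 + 1^2)\,(a^2 + b^2 + c^2 + d^2)
   = 4\,(a^2 + b^2 + c^2 + d^2).
\end{align*}
```

This is the sense in which the manipulation is "slightly brutal": the constant 4 is not optimal for any particular configuration of terms, but it is uniform.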



4.5. The full result. We are now able to prove the full result (Theorem 
2.3): 




Theorem 4.9. Let $\varepsilon > 0$, and suppose given initial conditions $v_0$ and
$p_{\xi,0}$. Then for any $N$ sufficiently large, say $N \geq N_1$, there exists an initial
condition for $(S_N)$ so that there is some "high-probability" $\Omega_1 \subseteq \Omega$ with
$\mathbb{P}(\Omega \setminus \Omega_1) < \varepsilon$ and such that

$$\sup_{0 \leq t \leq T} \|V^{(N)}_t - v_t\|_{H^1(I)} < \varepsilon,$$

$$\sup_{0 \leq t \leq T} \|C_{\xi,N}(\Xi^{(N)}_t) - p_{\xi,t}\|_{H^{-1}(I)} < \varepsilon$$

on $\Omega_1$.



The proof will rely on the various estimates we have made so far in the 
paper, and so we first recall those of the relevant constants that we will need 
again explicitly. There are two of these: 

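• $C_3$ is a uniform bound on $\sup_{0 \leq t \leq T} \|v_t\|_\infty$ and $\sup_{0 \leq t \leq T} \|V^{(N)}_t\|_\infty$,
independent of $N$;

• $C_6$ is such that the process $(C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t})_{0 \leq t \leq T}$ satisfies

$$\begin{aligned}
\|C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)}^2
\leq 4 \Bigl( & \|C_{\xi,N}(\Xi_0) - \mu\llcorner p_{\xi,0}\|_{H^{-1}(I)}^2 \\
&+ C_6^2 \int_0^t \sum_{\zeta \in E} \|C_{\zeta,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\zeta,s}\|_{H^{-1}(I)}^2 \, ds \\
&+ C_6^2 \, t \int_0^t \|V^{(N)}_s - v_s\|_{L^2(I)}^2 \, ds + \|M_{\xi,t}\|_{H^{-1}(I)}^2 \Bigr).
\end{aligned}$$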

Proof of Theorem 2.3. The proof consists of a further sequence of 
estimates; we break it into five steps. 

For the first four steps we lower our sights slightly to showing that for N 
sufficiently large we can ensure the bounds 

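$$\sup_{0 \leq t \leq T} \|V^{(N)}_t - v_t\|_{L^2(I)} < \varepsilon,$$

$$\sup_{0 \leq t \leq T} \|C_{\xi,N}(\Xi^{(N)}_t) - p_{\xi,t}\|_{H^{-1}(I)} < \varepsilon,$$

on some $\Omega_1$ with $\mathbb{P}(\Omega \setminus \Omega_1) < \varepsilon$; that is, our first estimate is now for $\|\cdot\|_{L^2(I)}$
rather than $\|\cdot\|_{H^1(I)}$. In Step 5 we will then bootstrap from this weakened
result to the full theorem, using properties of the norms in question and the
Feller semigroup $(P_t)_{t \geq 0}$.

Step 1. Calculating $\frac{d}{dt}\|V^{(N)}_t - v_t\|_{L^2(I)}^2$ from ($S_N$-PDE) and (D-PDE),
adding the diffusion term to both sides and rearranging gives

$$\begin{aligned}
\frac{d}{dt}\|V^{(N)}_t - v_t\|_{L^2(I)}^2 + 2\|D(V^{(N)}_t - v_t)\|_{L^2(I)}^2
={}& -2\sum_{\xi \in E} c_\xi \bigl( V^{(N)}_t - v_t,\; C_{\xi,N}(\Xi^{(N)}_t) \cdot V^{(N)}_t - \mu\llcorner p_{\xi,t} \cdot v_t \bigr) \\
&+ 2\sum_{\xi \in E} c_\xi v_\xi \bigl( V^{(N)}_t - v_t,\; C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t} \bigr).
\end{aligned}$$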
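We obtain a bound on this last expression by treating the terms in these
two sums separately. First we have

$$\begin{aligned}
\bigl( V^{(N)}_t - v_t,\; & C_{\xi,N}(\Xi^{(N)}_t) \cdot V^{(N)}_t - \mu\llcorner p_{\xi,t} \cdot v_t \bigr) \\
&= \bigl( V^{(N)}_t - v_t,\; (C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}) \cdot V^{(N)}_t \bigr)
 + \bigl( V^{(N)}_t - v_t,\; \mu\llcorner p_{\xi,t} \cdot (V^{(N)}_t - v_t) \bigr) \\
&= \bigl( V^{(N)}_t (V^{(N)}_t - v_t),\; C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t} \bigr)
 + \bigl( (V^{(N)}_t - v_t)^2,\; \mu\llcorner p_{\xi,t} \bigr).
\end{aligned}$$

We now bound these two subterms separately. The second can be bounded
directly by $\|V^{(N)}_t - v_t\|_{L^2(I)}^2$, and the first by

$$\begin{aligned}
\|V^{(N)}_t (V^{(N)}_t - v_t)\|_{H^1(I)} & \|C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)} \\
&= \|(V^{(N)}_t - v_t)^2 + v_t (V^{(N)}_t - v_t)\|_{H^1(I)} \|C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)} \\
&\leq \bigl( \|(V^{(N)}_t - v_t)^2\|_{H^1(I)} + \|v_t (V^{(N)}_t - v_t)\|_{H^1(I)} \bigr) \|C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)}.
\end{aligned}$$

Hence we obtain for the sum of the two subterms

$$\begin{aligned}
\bigl| \bigl( V^{(N)}_t - v_t,\; & C_{\xi,N}(\Xi^{(N)}_t) \cdot V^{(N)}_t - \mu\llcorner p_{\xi,t} \cdot v_t \bigr) \bigr| \\
&\leq \bigl( \|(V^{(N)}_t - v_t)^2\|_{H^1(I)} + \|v_t (V^{(N)}_t - v_t)\|_{H^1(I)} \bigr) \|C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)}
 + \|V^{(N)}_t - v_t\|_{L^2(I)}^2.
\end{aligned}$$

Similarly but more straightforwardly, we have for the terms in the second
sum

$$\bigl| \bigl( V^{(N)}_t - v_t,\; C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t} \bigr) \bigr|
\leq \|V^{(N)}_t - v_t\|_{H^1(I)} \|C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)}.$$

Adding these two inequalities in our original equation and integrating
with respect to $t$ gives

$$\begin{aligned}
\|V^{(N)}_t - v_t\|_{L^2(I)}^2 & + 2\int_0^t \|D(V^{(N)}_s - v_s)\|_{L^2(I)}^2 \, ds \\
\leq{}& \|V^{(N)}_0 - v_0\|_{L^2(I)}^2 \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) \sum_{\xi \in E} \int_0^t \|(V^{(N)}_s - v_s)^2\|_{H^1(I)} \, \|C_{\xi,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\xi,s}\|_{H^{-1}(I)} \, ds \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) \sum_{\xi \in E} \int_0^t \|v_s (V^{(N)}_s - v_s)\|_{H^1(I)} \, \|C_{\xi,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\xi,s}\|_{H^{-1}(I)} \, ds \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) \sum_{\xi \in E} \int_0^t |v_\xi| \, \|V^{(N)}_s - v_s\|_{H^1(I)} \, \|C_{\xi,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\xi,s}\|_{H^{-1}(I)} \, ds \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) |E| \int_0^t \|V^{(N)}_s - v_s\|_{L^2(I)}^2 \, ds.
\end{aligned}$$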



This is not yet in a very useful form, but to go further we will need to look 
first at some parts of this expression in more detail. 

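Step 2. In addition to $\|V^{(N)}_s - v_s\|_{H^1(I)}$, the expressions
$\|(V^{(N)}_s - v_s)^2\|_{H^1(I)}$ and $\|v_s (V^{(N)}_s - v_s)\|_{H^1(I)}$ have crept into our working. We do
not know any bounds on these quantities in particular, and would like to
remove them altogether; it turns out that we can do this, using the fact that
$V^{(N)}_s$, $v_s$ are uniformly bounded (Proposition 3.5). As in Section 4.4 we will
use the inequality

$$\int_I (\phi\psi)^2 + (D(\phi\psi))^2 \, d\mu
\leq \|\phi\|_\infty^2 \|\psi\|_{L^2(I)}^2 + 2\int_I ((D\phi)\psi)^2 \, d\mu + 2\int_I (\phi(D\psi))^2 \, d\mu$$

for $\phi, \psi \in H^1_0(I)$.

We will apply this twice. In the first case we take $\psi = \phi = V^{(N)}_t - v_t$ to
find that

$$\begin{aligned}
\|(V^{(N)}_t - v_t)^2\|_{H^1(I)}^2
&\leq \|(V^{(N)}_t - v_t)^2\|_{L^2(I)}^2 + 4\|V^{(N)}_t - v_t\|_\infty^2 \|D(V^{(N)}_t - v_t)\|_{L^2(I)}^2 \\
&\leq \|V^{(N)}_t - v_t\|_\infty^2 \bigl( \|V^{(N)}_t - v_t\|_{L^2(I)}^2 + 4\|D(V^{(N)}_t - v_t)\|_{L^2(I)}^2 \bigr).
\end{aligned}$$

Second, we keep $\phi = V^{(N)}_t - v_t$ but take $\psi = v_t$, to see that

$$\begin{aligned}
\|v_t (V^{(N)}_t - v_t)\|_{H^1(I)}^2
&\leq \|v_t (V^{(N)}_t - v_t)\|_{L^2(I)}^2 + 2\|Dv_t\|_\infty^2 \|V^{(N)}_t - v_t\|_{L^2(I)}^2
 + 2\|v_t\|_\infty^2 \|D(V^{(N)}_t - v_t)\|_{L^2(I)}^2 \\
&\leq \bigl( \|v_t\|_\infty^2 + 2\|Dv_t\|_\infty^2 \bigr) \|V^{(N)}_t - v_t\|_{L^2(I)}^2
 + 2\|v_t\|_\infty^2 \|D(V^{(N)}_t - v_t)\|_{L^2(I)}^2.
\end{aligned}$$

Here is the appearance of $\|Dv_t\|_\infty$ that we will need to bound uniformly
(recall Lemma 3.6).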

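Step 3. Now we can return to our estimate from the end of Step 1.
Applying the Cauchy-Schwarz inequality there a few times, we obtain

$$\begin{aligned}
\|V^{(N)}_t - v_t\|_{L^2(I)}^2 & + 2\int_0^t \|D(V^{(N)}_s - v_s)\|_{L^2(I)}^2 \, ds \\
\leq{}& \|V^{(N)}_0 - v_0\|_{L^2(I)}^2 \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) \sum_{\xi \in E} \sqrt{\int_0^t \|(V^{(N)}_s - v_s)^2\|_{H^1(I)}^2 \, ds} \sqrt{\int_0^t \|C_{\xi,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\xi,s}\|_{H^{-1}(I)}^2 \, ds} \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) \sum_{\xi \in E} \sqrt{\int_0^t \|v_s (V^{(N)}_s - v_s)\|_{H^1(I)}^2 \, ds} \sqrt{\int_0^t \|C_{\xi,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\xi,s}\|_{H^{-1}(I)}^2 \, ds} \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) \bigl( \max_\xi |v_\xi| \bigr) \sum_{\xi \in E} \sqrt{\int_0^t \|V^{(N)}_s - v_s\|_{H^1(I)}^2 \, ds} \sqrt{\int_0^t \|C_{\xi,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\xi,s}\|_{H^{-1}(I)}^2 \, ds} \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) |E| \int_0^t \|V^{(N)}_s - v_s\|_{L^2(I)}^2 \, ds.
\end{aligned}$$

Next we will use the AM-GM inequality on the terms comprising products
of square roots; however, we will apply it with a clever dodge well known
in the study of PDE, observing that for any $a, b \geq 0$ and any $\eta > 0$ this
inequality gives us

$$ab \leq \eta a^2 + \frac{1}{4\eta} b^2.$$

(See, e.g., Section 9.1 of Evans [7].)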
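For completeness, the weighted inequality follows from the ordinary AM-GM inequality $xy \leq \frac{1}{2}(x^2 + y^2)$ by rescaling the two factors. The intermediate names $x$ and $y$ below are introduced only for this sketch.

```latex
% x and y are auxiliary names for the rescaled factors (not in the paper).
\begin{align*}
x := \sqrt{2\eta}\,a, \qquad y := \frac{b}{\sqrt{2\eta}}
  \quad &\Longrightarrow \quad xy = ab, \\
ab = xy \le \tfrac{1}{2}\bigl(x^2 + y^2\bigr)
  &= \tfrac{1}{2}\Bigl(2\eta\,a^2 + \frac{b^2}{2\eta}\Bigr)
   = \eta\,a^2 + \frac{1}{4\eta}\,b^2 .
\end{align*}
```

The free parameter $\eta$ lets one make the coefficient of $a^2$ as small as desired at the cost of a large coefficient on $b^2$, which is how the gradient terms are absorbed later in this step.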

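Applying this to the above gives, for any $\eta > 0$,

$$\begin{aligned}
\|V^{(N)}_t - v_t\|_{L^2(I)}^2 & + 2\int_0^t \|D(V^{(N)}_s - v_s)\|_{L^2(I)}^2 \, ds \\
\leq{}& \|V^{(N)}_0 - v_0\|_{L^2(I)}^2 \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) |E| \, \eta \int_0^t \|(V^{(N)}_s - v_s)^2\|_{H^1(I)}^2 \, ds
 + 2 \bigl( \max_\xi c_\xi \bigr) \frac{1}{4\eta} \sum_{\xi \in E} \int_0^t \|C_{\xi,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\xi,s}\|_{H^{-1}(I)}^2 \, ds \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) |E| \, \eta \int_0^t \|v_s (V^{(N)}_s - v_s)\|_{H^1(I)}^2 \, ds
 + 2 \bigl( \max_\xi c_\xi \bigr) \frac{1}{4\eta} \sum_{\xi \in E} \int_0^t \|C_{\xi,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\xi,s}\|_{H^{-1}(I)}^2 \, ds \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) \bigl( \max_\xi |v_\xi| \bigr) |E| \, \eta \int_0^t \|V^{(N)}_s - v_s\|_{H^1(I)}^2 \, ds \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) \bigl( \max_\xi |v_\xi| \bigr) \frac{1}{4\eta} \sum_{\xi \in E} \int_0^t \|C_{\xi,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\xi,s}\|_{H^{-1}(I)}^2 \, ds \\
&+ 2 \bigl( \max_\xi c_\xi \bigr) |E| \int_0^t \|V^{(N)}_s - v_s\|_{L^2(I)}^2 \, ds.
\end{aligned}$$

Combining this with the inequalities from the end of Step 2, rearranging,
and taking the chance to slim down our notation a bit, it follows that there
is some fixed $C_7 > 0$ (not depending on $\eta$) such that

$$\begin{aligned}
\|V^{(N)}_t - v_t\|_{L^2(I)}^2 & + 2\int_0^t \|D(V^{(N)}_s - v_s)\|_{L^2(I)}^2 \, ds \\
\leq{}& \|V^{(N)}_0 - v_0\|_{L^2(I)}^2
 + \eta C_7 \int_0^t \|D(V^{(N)}_s - v_s)\|_{L^2(I)}^2 \, ds \\
&+ (1 + \eta) C_7 \int_0^t \|V^{(N)}_s - v_s\|_{L^2(I)}^2 \, ds
 + \frac{C_7}{\eta} \sum_{\xi \in E} \int_0^t \|C_{\xi,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\xi,s}\|_{H^{-1}(I)}^2 \, ds.
\end{aligned}$$

Now we can choose $\eta$ so small that $2 > \eta C_7$, and so deduce the simplified
inequality

$$\begin{aligned}
\|V^{(N)}_t - v_t\|_{L^2(I)}^2 \leq{}& \|V^{(N)}_0 - v_0\|_{L^2(I)}^2
 + (1 + \eta) C_7 \int_0^t \|V^{(N)}_s - v_s\|_{L^2(I)}^2 \, ds \\
&+ \frac{C_7}{\eta} \sum_{\xi \in E} \int_0^t \|C_{\xi,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\xi,s}\|_{H^{-1}(I)}^2 \, ds;
\end{aligned}$$

this follows simply by dropping the terms involving $\|D(V^{(N)}_s - v_s)\|_{L^2(I)}^2$.
The whole point of introducing $\eta$ was to allow us to do this; now we are left
with terms that we know more about.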

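Step 4. Recall next our growth inequality for the finite variation parts
of the difference processes from Section 4.4:

$$\begin{aligned}
\|C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)}^2
\leq 4 \Bigl( & \|C_{\xi,N}(\Xi_0) - \mu\llcorner p_{\xi,0}\|_{H^{-1}(I)}^2 \\
&+ C_6^2 \int_0^t \sum_{\zeta \in E} \|C_{\zeta,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\zeta,s}\|_{H^{-1}(I)}^2 \, ds \\
&+ C_6^2 \, t \int_0^t \|V^{(N)}_s - v_s\|_{L^2(I)}^2 \, ds + \|M_{\xi,t}\|_{H^{-1}(I)}^2 \Bigr).
\end{aligned}$$

Summing over $\xi \in E$ gives

$$\begin{aligned}
\sum_{\xi \in E} \|C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)}^2
\leq 4 \Bigl( & \sum_{\xi \in E} \|C_{\xi,N}(\Xi_0) - \mu\llcorner p_{\xi,0}\|_{H^{-1}(I)}^2 \\
&+ C_6^2 |E| \int_0^t \sum_{\zeta \in E} \|C_{\zeta,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\zeta,s}\|_{H^{-1}(I)}^2 \, ds \\
&+ C_6^2 |E| \, t \int_0^t \|V^{(N)}_s - v_s\|_{L^2(I)}^2 \, ds + \sum_{\xi \in E} \|M_{\xi,t}\|_{H^{-1}(I)}^2 \Bigr).
\end{aligned}$$

The end is now brought close with a monster application of Gronwall's
lemma. Add the growth inequality obtained at the end of Step 3 to the
above to find

$$\begin{aligned}
\|V^{(N)}_t - v_t\|_{L^2(I)}^2 & + \sum_{\xi \in E} \|C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)}^2 \\
\leq{}& \|V^{(N)}_0 - v_0\|_{L^2(I)}^2
 + 4 \sum_{\xi \in E} \|C_{\xi,N}(\Xi_0) - \mu\llcorner p_{\xi,0}\|_{H^{-1}(I)}^2
 + 4 \sum_{\xi \in E} \|M_{\xi,t}\|_{H^{-1}(I)}^2 \\
&+ \bigl( (1 + \eta) C_7 + 4 C_6^2 |E| t \bigr) \int_0^t \|V^{(N)}_s - v_s\|_{L^2(I)}^2 \, ds \\
&+ \Bigl( \frac{C_7}{\eta} + 4 C_6^2 |E| \Bigr) \int_0^t \sum_{\xi \in E} \|C_{\xi,N}(\Xi^{(N)}_s) - \mu\llcorner p_{\xi,s}\|_{H^{-1}(I)}^2 \, ds.
\end{aligned}$$

Letting

$$f(t) = \|V^{(N)}_t - v_t\|_{L^2(I)}^2 + \sum_{\xi \in E} \|C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)}^2,$$

we see that our above inequality implies

$$f(t) \leq A + B \int_0^t f(s) \, ds,$$

and so (by Gronwall) $f(t) \leq A e^{Bt}$, where

$$A = \|V^{(N)}_0 - v_0\|_{L^2(I)}^2 + 4 \sum_{\xi \in E} \|C_{\xi,N}(\Xi_0) - \mu\llcorner p_{\xi,0}\|_{H^{-1}(I)}^2
 + 4 \sup_{0 \leq t \leq T} \sum_{\xi \in E} \|M_{\xi,t}\|_{H^{-1}(I)}^2$$

and

$$B = \bigl( (1 + \eta) C_7 + 4 C_6^2 |E| T \bigr) \vee \Bigl( \frac{C_7}{\eta} + 4 C_6^2 |E| \Bigr).$$

Here $B$ does not depend on $N$, but by choosing $N$ sufficiently large we can
make $A$ small in probability. Indeed, $V^{(N)}_0 = V_0 = v_0$, and given $\varepsilon_1 > 0$ we
note the following:

1. from Proposition 4.5, for all sufficiently large $N$, say $N \geq N_1$, we can find
a subset $\Omega_1$ of $\Omega$ such that

$$\mathbb{P}(\Omega \setminus \Omega_1) < \varepsilon$$

and

$$4 \sup_{0 \leq t \leq T} \sum_{\xi \in E} \|M_{\xi,t}\|_{H^{-1}(I)}^2 < \frac{\varepsilon_1}{2}$$

on all of $\Omega_1$;

2. for all sufficiently large $N$, say $N \geq N_2$, we can choose $\Xi_0$ so that

$$4 \sum_{\xi \in E} \|C_{\xi,N}(\Xi_0) - \mu\llcorner p_{\xi,0}\|_{H^{-1}(I)}^2 < \frac{\varepsilon_1}{2}$$

(this amounts to choosing $N$ so large that we may approximate each $\mu\llcorner p_{\xi,0}$
sufficiently well with a linear combination of $\delta_{i/N}$ drawn from the same $\Xi_0$;
it is routine to see that this is possible). Therefore, for this choice of initial
conditions and for $N$ at least $N_1 \vee N_2$ we have for all $t \in [0, T]$

$$\|V^{(N)}_t - v_t\|_{L^2(I)}^2, \; \|C_{\xi,N}(\Xi^{(N)}_t) - \mu\llcorner p_{\xi,t}\|_{H^{-1}(I)}^2 \leq \varepsilon_1 e^{BT}$$

on the large subset $\Omega_1 \subseteq \Omega$; choosing $\varepsilon_1 < e^{-BT} \varepsilon^2$ completes the proof of the
weakened estimates.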
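The Gronwall step used in Step 4 can be verified directly in the integral form in which it is applied. The sketch below assumes that $f$ is nonnegative, bounded and measurable on $[0,T]$ and that $A \geq 0$, $B > 0$ are constants; the auxiliary function $F$ is a name introduced here for illustration.

```latex
% Assumptions of this sketch: f >= 0 bounded measurable on [0,T];
% A >= 0 and B > 0 constants (the case B = 0 is immediate).
\begin{align*}
F(t) &:= \int_0^t f(s)\,ds, \qquad
  F'(t) = f(t) \le A + B\,F(t) \quad \text{for a.e. } t, \\
\frac{d}{dt}\bigl(e^{-Bt}F(t)\bigr)
  &= e^{-Bt}\bigl(F'(t) - B\,F(t)\bigr) \le A\,e^{-Bt}, \\
e^{-Bt}F(t) &\le \int_0^t A\,e^{-Bs}\,ds = \frac{A}{B}\bigl(1 - e^{-Bt}\bigr), \\
f(t) &\le A + B\,F(t) \le A + A\bigl(e^{Bt} - 1\bigr) = A\,e^{Bt}.
\end{align*}
```

Since $t \leq T$, this also gives the uniform bound $f(t) \leq A e^{BT}$ on $[0,T]$, which is the form in which the estimate is used to choose $\varepsilon_1$.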

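Step 5. Finally we seek to improve our convergence result for the
difference of potentials $V^{(N)} - v$ to convergence in $H^1_0$, by proving that for
$N$ sufficiently large we actually have

$$\sup_{0 \leq t \leq T} \|D(V^{(N)}_t - v_t)\|_{L^2(I)} < \varepsilon$$

on $\Omega_1$. It turns out that this follows quickly from what we already know and
the integral forms of (D-PDE) and ($S_N$-PDE). Substituting from these and
subtracting, we have

$$D(V^{(N)}_t - v_t) = D \int_0^t P_{t-s} \Bigl( \frac{1}{N} \sum_{i \in \mathbb{Z} \cap NI^\circ} c_{\Xi^{(N)}_s(i)} \bigl( v_{\Xi^{(N)}_s(i)} - V^{(N)}_s(i/N) \bigr) \delta_{i/N}
 - \sum_{\xi \in E} c_\xi (v_\xi - v_s) \, \mu\llcorner p_{\xi,s} \Bigr) ds.$$

Writing

$$\lambda_{\mathrm{fin}} = \frac{1}{N} \sum_{i \in \mathbb{Z} \cap NI^\circ} c_{\Xi^{(N)}_s(i)} \bigl( v_{\Xi^{(N)}_s(i)} - V^{(N)}_s(i/N) \bigr) \delta_{i/N},
\qquad
\lambda_\infty = \sum_{\xi \in E} c_\xi (v_\xi - v_s) \, \mu\llcorner p_{\xi,s}$$

and

$$\lambda_N = \lambda_{\mathrm{fin}} - \lambda_\infty$$

(all measures on $I$), our above expression for the derivative becomes

$$D \int_0^t P_{t-s} \lambda_N \, ds.$$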







Therefore we need to prove that

$$\sup_{0 \leq t \leq T} \Bigl\| D \int_0^t P_{t-s} \lambda_N \, ds \Bigr\|_{L^2(I)} < \varepsilon$$

on $\Omega_1$ for $N$ sufficiently large.

We now deploy again our trick of breaking the integral into two parts, say
over $(0, t - \varepsilon_2)$ and $(t - \varepsilon_2, t)$.

However, this time we use also two more standard observations for our
Feller semigroup $(P_t)_{t \geq 0}$, which follow from the corresponding properties for
the ordinary heat kernel: first, that for some fixed $C_8 > 0$ we have

$$|DP_u \lambda|(x) \leq \frac{C_8}{\sqrt{u}} P_{u/2} \lambda(x)$$

for any finite positive measure $\lambda$ on $I$; and second, that there is some $C_9 > 0$
such that for any $u > 0$ we have

$$\int_I |P_u \lambda(x)|^2 \, \mu(dx) \leq \frac{C_9^2}{\sqrt{u}} \|\lambda\|^2$$

[recalling that $\mu$ denotes Lebesgue measure, and writing $\|\lambda\|$ for the usual
norm of a measure regarded as a linear functional on $C(I)$]. Combining these
allows us to estimate the "awkward" part of our integral: for $t - \varepsilon_2 < s < t$
and any finite positive measure $\lambda$ we have

$$\begin{aligned}
\|DP_{t-s} \lambda\|_{L^2(I)} &= \sqrt{\int_I |DP_{t-s} \lambda(x)|^2 \, \mu(dx)} \\
&\leq C_8 \frac{1}{\sqrt{t-s}} \sqrt{\int_I |P_{(t-s)/2} \lambda(x)|^2 \, \mu(dx)} \\
&\leq 2 C_8 C_9 \frac{1}{(t-s)^{3/4}} \|\lambda\|.
\end{aligned}$$

Hence, by the integral triangle inequality,

$$\Bigl\| D \int_{t-\varepsilon_2}^t P_{t-s} \lambda_N \, ds \Bigr\|_{L^2(I)}
\leq 2 C_8 C_9 \|\lambda_N\| \int_0^{\varepsilon_2} \frac{1}{u^{3/4}} \, du = 8 C_8 C_9 \|\lambda_N\| \varepsilon_2^{1/4}.$$
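The constant 8 in the last display comes from an elementary substitution and power-rule integral, which can be checked as follows (with $u = t - s$ as the substitution variable, matching the display above).

```latex
% Substitution u = t - s maps s in (t - \varepsilon_2, t) to u in (0, \varepsilon_2).
\begin{align*}
\int_{t-\varepsilon_2}^{t} \frac{ds}{(t-s)^{3/4}}
  &= \int_0^{\varepsilon_2} u^{-3/4}\,du
   = \Bigl[\,4\,u^{1/4}\Bigr]_0^{\varepsilon_2}
   = 4\,\varepsilon_2^{1/4}, \\
2\,C_8 C_9\,\|\lambda_N\| \cdot 4\,\varepsilon_2^{1/4}
  &= 8\,C_8 C_9\,\|\lambda_N\|\,\varepsilon_2^{1/4}.
\end{align*}
```

The exponent $3/4 < 1$ is what makes the singularity at $s = t$ integrable, so the contribution of the short interval tends to zero with $\varepsilon_2$.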



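Since $\|\lambda_N\| \leq \|\lambda_{\mathrm{fin}}\| + \|\lambda_\infty\|$ is bounded uniformly in $N$ (since $V^{(N)}_t$, $v_t$ are
bounded uniformly in $N$ for $t \in [0,T]$), we can choose $\varepsilon_2$ so small that

$$\Bigl\| D \int_{t-\varepsilon_2}^t P_{t-s} \lambda_N \, ds \Bigr\|_{L^2(I)} < \frac{\varepsilon}{2}$$

for every $N$, every $t \in [0, T]$ and on the whole state space $\Omega$.

On the other hand, owing to the smoothing properties of $P_{t-s}$ for $s < t$
with $t - s$ bounded away from $0$, if we choose $N$ large enough that

$$\sup_{0 \leq t \leq T} \|V^{(N)}_t - v_t\|_{L^2(I)}$$

and

$$\sup_{0 \leq t \leq T} \|C_{\xi,N}(\Xi^{(N)}_t) - p_{\xi,t}\|_{H^{-1}(I)}$$

are sufficiently small on the whole of the high-probability event $\Omega_1$, then we
can ensure that we also have

$$\Bigl\| D \int_0^{t-\varepsilon_2} P_{t-s} \lambda_N \, ds \Bigr\|_{L^2(I)} < \frac{\varepsilon}{2}$$

for any $t \in [0, T]$ and on the whole of $\Omega_1$. Combining these two estimates
now gives the final result. □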



5. Closing remarks. 



5.1. Appropriateness of the model. It is worth remarking on other stochas- 
tic models of nerve axons. These have tended to concentrate on using a white 
noise continuously distributed along the axon to model its stochastic nature, 
coupled via a suitable nonlinear parabolic PDE to the potential difference 
V in the same way as for our stochastic individual ion channels. This white- 
noise approach leads to a more traditional system of SPDE, for which there 
are corresponding existence and uniqueness results. A good introduction 
to this approach is given starting at Chapter 6 of Tuckwell [18], who also 
describes various stochastic approximations that save on computational ex- 
pense. 

However, it is not clear that the regime in which this SPDE model is really 
appropriate is very physically interesting, for in this regime the channel size 
must be considered negligible even though the fluctuations caused by their 
stochastic nature are not (this is the regime in which spatial white noise
arises). It is arguably more natural to build a model that takes individual
ion channels into account and then, if desired, proceed directly to the deter- 
ministic limit given by the classical Hodgkin-Huxley equations. This is the 
procedure usually followed in physiology, and with which the present paper 
is concerned. 

5.2. Some further directions. We finish by describing some further di- 
rections in which the rigorous analysis of the stochastic Hodgkin-Huxley 
equations might be taken: 

1. The existence and convergence results proved above are of more academic 
than computational interest. However, they were originally motivated by 
a rather more practical problem. 

There has recently been growing interest in the deviation of a real 
axon from the deterministic behavior predicted by the Hodgkin-Huxley 
theory as a result of the stochastic nature of its components. In particular, 
Faisal, White and Laughlin [8] have investigated numerically the question 
of whether a sufficiently small axon might suffer frequent spontaneous 
action potentials generated by the chance event that a small number of
Na$^+$ channels in close proximity stay open longer than usual and so cause
a small initial rise in the membrane potential in their vicinity. They find
that this can occur with a probability that increases greatly as the axonal
diameter drops below about 0.1 μm.

Faisal, White and Laughlin's approach uses a purpose-coded computer 
simulator of axonal behavior and has a very high computational expense. 
It would be valuable to have cleaner, analytic bounds on the probabilities 
(or, equivalently, the long-term rates) of such events occurring. One might 
conjecture that, in a suitable regime of many small ion channels but 
with time speeded up accordingly, spontaneous action potentials appear 
distributed roughly as a Poisson point process in the relevant space-time 
band [0, T] x /, where T is taken very large. 

Such a result could appear as a sort of large-deviation principle around 
a point of equilibrium for our whole system, analogous to the analysis of 
metastability through large-deviation theory for finite-dimensional dy- 
namical systems perturbed by a weak additive noise, as developed by 
Freidlin and Wentzell in [10]; however, their techniques would need some 
adaptation to suit the case of an infinite-dimensional system coupled to 
a large discrete system, as in the stochastic Hodgkin-Huxley model. The 
methods in the present paper do not seem to extend so far. 

In fact, this problem of determining the rate of spontaneous action 
potential generation fits naturally among various other questions that 
can be asked about the still-stochastic behavior of a real axon. Such a 
development in the special case of this model might also take into account
the different relative effects of the noise in the system while it
is undergoing different behavior: in particular, it seems that for realistic
channel-numbers on patches of the axonal membrane there is much
greater noise around local sub-threshold behavior (as when at equilib- 
rium) than super-threshold potentials (as during the transmission of the 
front of an action potential). Relatedly, this analysis might also consider 
the possible deterioration of an action potential in the stochastic model, 
for which a strict solitary wave solution may not exist. Finally, we men- 
tion that Steinmetz, Manwani and Koch [17] have recently studied the 
reliability in the times of the spikes output by a neuron considered as a 
transmission of the input times by modeling a small patch of ion chan- 
nels using the (nonspatial) stochastic Hodgkin-Huxley model (they also 
conduct a similar investigation of an alternative, the "integrate-and-fire" 
model). As they remark more generally, 

"The stochastic Markov version of the HH model converges to the clas- 
sical, deterministic model as the number of channels grows large, but for 
realistic channel numbers, the stochastic model can exhibit a wide variety 
of behaviors (spontaneous spiking, bursting, chaos, and so on) that cannot 
be observed in deterministic model ..." 

It remains to be seen to what extent a more analytic treatment can be 
given of these different behaviors. 
2. The model analyzed in this paper ignores possible external effects acting 
on the axon. A separate task which might be of interest is to perform the 
analysis of convergence in the case of an axon subject to some stimulus: 
for example, the arrival of a signal from the soma along the axon, modeled 
by changing the boundary conditions of our PDE (so that a particular 
input arrives at the soma end, and the conditions at the other end are 
free), or the application of a transmembrane current along the length
of the axon, as is sometimes used in experiments to stimulate an action 
potential. These stimuli could themselves be deterministic or stochastic; 
in the former case, one would expect the behavior of the stochastic model 
to converge to the trajectory of an appropriately modified PDE, while in 
the latter, even the limit model would have stochastic components. 

For deterministic inputs, it seems likely that the methods of the present 
paper could still be brought to bear. However, the details of the estimates 
needed for the proofs both of existence and regularity and then of con- 
vergence might become considerably more complicated, and we have not 
tried to work out the details. In the case of stochastic inputs, some fur- 
ther modification of the "hands-on" convergence machinery of Darling 
and Norris [4], as we have adapted it to our needs in this paper, would 
be needed, possibly using a suitable coupling of the full stochastic model 
and the limiting stochastic model as $N \to \infty$.
3. A different modification of the models studied in this paper would be
to work over a membrane regarded as a two-dimensional surface. This 
could both increase the accuracy of the model studied here (recall our 
early simplifying assumption to treat the axon as a line segment rather 
than a cylinder), and possibly give an analogous account in the case of 
other, two-dimensional excitable membranes that appear in physiology 
(see, e.g., Hille [11]). As for the second extension mentioned above, it 
seems likely that the basic methods of this paper would still be useful, 
but in this case the required estimates would probably become much 
more difficult, and depend strongly on the underlying two-dimensional 
geometry (as far as we know, even the purely deterministic model has 
not been rigorously analyzed in this setting).